
Explanation:
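To see why, consider how trigger intervals work in Structured Streaming. The trigger setting on a streaming write determines when the system processes the next set of data, and therefore how often micro-batches run. By default, Spark automatically detects and processes the data added since the last trigger. Trigger.AvailableNow, available in DBR 10.1 for Scala and in DBR 10.2 and above for both Python and Scala, processes all available data in multiple micro-batches and then stops, so trigger(availableNow=True) is the correct choice for this scenario. Among the other options, trigger(once=True) also stops on its own but processes everything in a single batch; trigger(processingTime="500ms") keeps the query running and fires a micro-batch every 500 ms; "now" is not a time interval that processingTime can parse; and available_Now is not a keyword that trigger() accepts.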
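As a rough illustration of where the trigger line sits in a streaming write, the sketch below wraps the write in a small helper. The helper name run_until_caught_up and its parameters (stream_df, target_table, checkpoint_path) are hypothetical and not part of the original question; it assumes stream_df is a PySpark streaming DataFrame and a Spark version in which availableNow is supported.

```python
def run_until_caught_up(stream_df, target_table, checkpoint_path):
    """Write a streaming DataFrame to a table, then stop once caught up.

    Hypothetical helper: stream_df is assumed to be a PySpark streaming
    DataFrame, e.g. the result of spark.readStream.table(...) plus transforms.
    """
    query = (
        stream_df.writeStream
        # The checkpoint records progress so a rerun resumes where it left off.
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available in micro-batches, then stop.
        .trigger(availableNow=True)
        # toTable starts the query and returns a StreamingQuery handle.
        .toTable(target_table)
    )
    # Block until the query has drained the available data and stopped itself.
    query.awaitTermination()
    return query
```

A call might look like run_until_caught_up(spark.readStream.table("source_table"), "new_table", "/tmp/checkpoints/new_table"), where the table names and checkpoint path are placeholders.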
A data engineer has set up a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table. The provided code block is missing a crucial line to complete the operation. The goal is to have the query execute in multiple micro-batches, process all available data, and then stop automatically. Which of the following code lines should fill the blank to achieve this?
A
trigger(processingTime="500ms")
B
trigger(availableNow=True)
C
trigger(once=True)
D
trigger(processingTime="now")
E
trigger(available_Now="True")