
Answer-first summary for fast verification
Answer: trigger(availableNow=True)
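In Structured Streaming, trigger(availableNow=True) tells the query to process all the data that is available when the query starts, splitting the work into as many micro-batches as required, and then stop. No trigger interval is specified, and the query does not wait for more data to arrive. This differs from trigger(once=True), which handles the available data in a single batch. Among the other options, parallelBatch is not a trigger parameter, and "once" is not a valid value for either processingTime or continuous, since both expect an interval string such as "10 seconds".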
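As a rough illustration, the pattern described above might look like the following in PySpark. This is a minimal sketch, not the code block from the question: the function name run_available_now and its parameters (source_table, target_table, checkpoint_path) are hypothetical, the processing step is omitted, and it assumes a Spark version that supports availableNow in PySpark (3.3.0 or later).

```python
def run_available_now(spark, source_table, target_table, checkpoint_path):
    """Process everything currently available in source_table, then stop.

    All names here are illustrative assumptions; `spark` is an existing
    SparkSession supplied by the caller.
    """
    query = (
        spark.readStream
        .table(source_table)                            # read the existing table as a stream
        .writeStream
        .option("checkpointLocation", checkpoint_path)  # progress tracking across runs
        .outputMode("append")
        .trigger(availableNow=True)                     # drain available data in multiple micro-batches
        .toTable(target_table)                          # starts the query, returns a StreamingQuery
    )
    query.awaitTermination()  # returns once the available data has been processed
    return query
```

A call such as run_available_now(spark, "bronze_events", "silver_events", "/tmp/checkpoints/silver_events") would then process the backlog and exit; the table names and checkpoint path in this call are placeholders as well.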
Author: LeetQuiz Editorial Team
A data engineer is tasked with creating a Structured Streaming job in Databricks that reads from an existing table, processes the data, and writes the processed data into a new table in real time. The aim is to ensure that the streaming query handles all the available data by processing it in the necessary number of batches. Below is the code block used by the data engineer. Identify which line of code should be inserted to ensure that all available data is processed in multiple batches.
A. processingTime(1)
B. trigger(availableNow=True)
C. trigger(parallelBatch=True)
D. trigger(processingTime="once")
E. trigger(continuous="once")