
Answer-first summary for fast verification
Answer: 500 milliseconds
Understanding the default trigger interval is key to answering this question. The trigger interval determines how often a streaming query kicks off the next micro-batch in a streaming write. If no trigger is set explicitly, Spark runs in the default micro-batch mode and processes new data as soon as it is available, which behaves as if `processingTime="500ms"` had been specified. Therefore, the correct answer is **500 milliseconds**.
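As a sketch of how the default could be written out explicitly (assuming an active `SparkSession` named `spark`, an existing streaming-readable `sales` table, and a `checkpointPath` variable, none of which are defined here), the query from the question is equivalent to adding a `trigger` call:

```python
from pyspark.sql.functions import col

# Hypothetical setup: `spark`, `checkpointPath`, and the `sales` table
# are assumed to exist; this mirrors the snippet in the question.
(spark.readStream.table("sales")
    .withColumn("avg_price", col("sales") / col("units"))
    .writeStream
    .trigger(processingTime="500ms")  # explicit form of the default interval
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .table("new_sales"))
```

Omitting the `.trigger(...)` line leaves the behavior unchanged, which is exactly what the question is testing.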
Author: LeetQuiz Editorial Team
A data engineer has set up a Structured Streaming job to read from a table, process the data, and then write it into a new table in a streaming fashion. The code snippet used is as follows:
```python
spark.table("sales")
    .withColumn("avg_price", col("sales") / col("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .table("new_sales")
```
If the trigger method is not specified in the code, what is the default processingTime the system will use for processing the next batch of data?
A. 50 seconds
B. 5 minutes
C. 500 milliseconds
D. 0.5 hours
E. 5000 microseconds