
Explanation:
In Spark Structured Streaming, the processingTime trigger runs a query in micro-batches at a user-specified interval. It is passed as a keyword argument to trigger(), so a 2-minute interval is written trigger(processingTime="2 minutes"), which is option B. When no trigger is set, the default interval is "500ms". For more details, refer to the Structured Streaming Triggers documentation.
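As a minimal sketch, the completed query might look like the following in PySpark. The function name `start_new_orders_stream` and the constant `TRIGGER_INTERVAL` are hypothetical, introduced only for illustration. The sketch assumes that `orders` resolves to a streaming source in the session passed in. It also uses `toTable` (available in open-source PySpark 3.1 and later) where the question writes `.table(...)`, and `orders["total"]` in place of `col("total")` so that no extra import is needed.

```python
# Sketch only: `start_new_orders_stream` and `TRIGGER_INTERVAL` are hypothetical names.
TRIGGER_INTERVAL = "2 minutes"


def start_new_orders_stream(spark, checkpoint_path):
    """Start the query from the question with a 2-minute processing-time trigger."""
    orders = spark.table("orders")  # source table, as written in the question
    return (
        orders
        # orders["total"] refers to the same column as col("total")
        .withColumn("total_after_tax", orders["total"] + orders["tax"])
        .writeStream
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        # Option B: the interval is passed via the processingTime keyword
        .trigger(processingTime=TRIGGER_INTERVAL)
        # toTable starts the stream; the question writes .table(...) here
        .toTable("new_orders")
    )
```

In a notebook this could be called as `query = start_new_orders_stream(spark, checkpointPath)`. In open-source PySpark, `trigger` accepts keyword arguments only and `once` expects `True` rather than an interval, which is why the positional form in option D and the `once="2 minutes"` form in option A do not work.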
Consider the following Structured Streaming query in Databricks:
spark.table("orders")
.withColumn("total_after_tax", col("total") + col("tax"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
._____________
.table("new_orders")
Which option correctly fills in the blank so that the query processes data in a micro-batch every 2 minutes?
A. trigger(once="2 minutes")
B. trigger(processingTime="2 minutes")
C. processingTime("2 minutes")
D. trigger("2 minutes")
E. trigger()