Consider the following Structured Streaming query in Databricks:
spark.table("orders")
.withColumn("total_after_tax", col("total") + col("tax"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
._____________
.table("new_orders")
Which option correctly fills in the blank so that the query processes data in micro-batches every 2 minutes?
Explanation:
In Spark Structured Streaming, the processingTime trigger is used to process data in micro-batches at a user-specified interval. The correct way to fill in the blank is .trigger(processingTime="2 minutes"), which starts a new micro-batch every 2 minutes. This trigger lets you control the batch processing interval; if no trigger is specified, Databricks uses a default interval of 500ms. For more details, refer to the Structured Streaming Triggers documentation.
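Below is a minimal sketch of the completed query. It assumes a streaming-readable "orders" table exists and uses a hypothetical checkpoint path; it also uses spark.readStream.table to read the source as a stream and PySpark's toTable method to write the results, which is the standard open-source spelling of the pattern shown in the question. In a Databricks notebook the spark session is already defined, but it is created explicitly here so the example is self-contained.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()          # pre-defined in Databricks notebooks
checkpointPath = "/tmp/checkpoints/new_orders"      # hypothetical checkpoint location

query = (
    spark.readStream.table("orders")                            # read the source table as a stream
    .withColumn("total_after_tax", col("total") + col("tax"))   # derive the new column
    .writeStream
    .option("checkpointLocation", checkpointPath)               # required for fault-tolerant progress tracking
    .outputMode("append")
    .trigger(processingTime="2 minutes")                        # start a new micro-batch every 2 minutes
    .toTable("new_orders")                                      # write the stream to the target table
)

The returned query object is a StreamingQuery; calling query.stop() ends the stream, and omitting the .trigger(...) line would fall back to the default micro-batch interval described above.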