A data engineer has set up a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table. The code block used is as follows:
(spark.table("sales")
    .withColumn("avg_price", col("sales") / col("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    ._________
    .table("new_sales")
)
If the goal is for the query to execute in micro-batches and be triggered every 5 minutes, which of the following lines of code should fill in the blank?
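For reference, a micro-batch query triggered on a fixed interval uses `trigger(processingTime="5 minutes")`. The sketch below shows the completed pipeline under the assumptions that `spark` is an active SparkSession, the `sales` table exists, and `checkpointPath` (taken from the question) points to a writable checkpoint directory; it is illustrative, not runnable on its own.

```python
from pyspark.sql.functions import col

# Sketch of the completed streaming write; assumes an active SparkSession
# (`spark`) and a writable checkpoint directory in `checkpointPath`.
(spark.table("sales")
    .withColumn("avg_price", col("sales") / col("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .trigger(processingTime="5 minutes")  # start a micro-batch every 5 minutes
    .table("new_sales")
)
```

Note the contrast with the other trigger options: `trigger(once=True)` processes all available data in a single micro-batch and then stops, while `trigger(continuous="5 minutes")` switches to continuous processing (not micro-batches) with a 5-minute checkpoint interval.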