
Answer-first summary for fast verification
Answer: A. `trigger(once=True)`
## Explanation

In Apache Spark Structured Streaming, to execute a streaming query that processes all available data in a **single micro-batch** and then stops, use the `trigger(once=True)` option.

### Why option A is correct

- `trigger(once=True)` is designed for exactly this use case.
- It processes all available data in one micro-batch and then terminates the streaming query.
- It is commonly used for batch-like, run-and-stop processing while keeping the benefits of the streaming framework (checkpointing, incremental reads).

### Why the other options are incorrect

- **B. `trigger(continuous="once")`**: Invalid. The `continuous` trigger enables continuous processing mode and expects a checkpoint-interval string such as `"1 second"`, not `"once"`.
- **C. `processingTime("once")`**: Invalid syntax. There is no standalone `processingTime` method on `DataStreamWriter`; `processingTime` is a keyword argument to `trigger()`, and it expects a time-interval string such as `"1 second"` or `"5 minutes"`.
- **D. `trigger(processingTime="once")`**: Invalid. The `processingTime` trigger expects a time-interval string, and `"once"` is not a valid interval.
- **E. `processingTime(1)`**: Invalid syntax for the same reason as C; in addition, even as a `trigger()` keyword argument, `processingTime` requires a string interval, not an integer.

### Key points

- `trigger(once=True)` is the proper way to execute one-time batch processing with Structured Streaming.
- It is useful when you want to process all available data at once and then stop.
- The query processes all data that has accumulated since the last trigger and then terminates.
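As a sketch, this is how the trigger fits into a streaming write. It assumes an active `SparkSession` named `spark`, an existing `sales` table, and a `checkpointPath` variable, none of which are defined here:

```python
# Sketch: a streaming write that processes all available data in a
# single micro-batch and then stops. Assumes `spark`, a `sales` table,
# and `checkpointPath` already exist in the session.
from pyspark.sql.functions import col

(spark.table("sales")
    .withColumn("avg_price", col("sales") / col("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .trigger(once=True)  # one micro-batch over all available data, then stop
    .table("new_sales"))
```

Because the trigger fires only once, the query terminates on its own after the batch completes; there is no need to call `awaitTermination` with a timeout or stop the query manually.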
Author: LeetQuiz
## Question 26

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.
The code block used by the data engineer is below:
```python
(spark.table("sales")
    .withColumn("avg_price", col("sales") / col("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .______
    .table("new_sales"))
```
If the data engineer only wants the query to execute a single micro-batch to process all of the available data, which of the following lines of code should the data engineer use to fill in the blank?
A. `trigger(once=True)`

B. `trigger(continuous="once")`

C. `processingTime("once")`

D. `trigger(processingTime="once")`

E. `processingTime(1)`