
Answer-first summary for fast verification
Answer:

```python
preds.write.mode("append").saveAsTable("churn_preds")
```
### Correct Choice

**B. `preds.write.mode("append").saveAsTable("churn_preds")`**

### Why it's the best fit:

1. **Cost-Efficiency:** For data that is processed only once per day, a batch write is significantly cheaper than Structured Streaming. A batch job performs the write and lets the compute resources shut down immediately, whereas a continuous stream (even an idle one) and the overhead of managing streaming checkpoints are unnecessary for daily data.
2. **Historical Integrity:** The `append` mode adds each day's predictions as new rows to the existing Delta table. This preserves historical data, enabling comparisons across time.
3. **Managed Delta Lake Table:** On Databricks, `saveAsTable` defaults to the Delta format, providing ACID transactions and time-travel capabilities out of the box.

### Analysis of Other Options:

* **Option C:** The default write mode for Spark's batch `DataFrameWriter` is `errorIfExists`, so this snippet would fail on the second day because the path already exists.
* **Option E:** Using `overwrite` replaces the entire dataset every day, making the historical comparisons the team requested impossible.
* **Options A & D:** Structured Streaming is inefficient for data that arrives once a day. Even with `Trigger.AvailableNow`, a simple batch `append` has lower overhead and is the preferred choice at this frequency. Option A is also invalid as written: `overwrite` is not a supported streaming output mode (only `append`, `complete`, and `update` are), and a `complete`-style rewrite would in any case destroy the history the team requires.
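The behavioral difference between the write modes can be illustrated without a Spark cluster. The `DeltaTableSim` class below is a hypothetical stand-in (not Spark or Delta Lake code) that mimics how `append`, `overwrite`, and the batch default `errorIfExists` each treat an already-existing table:

```python
# Toy simulation of batch write modes (illustration only, not Spark code).
class DeltaTableSim:
    def __init__(self):
        self.rows = None  # None means the table does not exist yet

    def write(self, new_rows, mode="errorIfExists"):
        if self.rows is None:
            self.rows = list(new_rows)      # first write always creates the table
        elif mode == "append":
            self.rows.extend(new_rows)      # keeps prior rows -> Option B
        elif mode == "overwrite":
            self.rows = list(new_rows)      # replaces prior rows -> Option E
        else:                               # errorIfExists (Spark's batch default) -> Option C
            raise ValueError("table already exists")

table = DeltaTableSim()
table.write([("2024-01-01", "user1", 0.91)], mode="append")  # day 1
table.write([("2024-01-02", "user1", 0.42)], mode="append")  # day 2
print(len(table.rows))  # both days retained -> 2
```

Running the same two daily writes with `mode="overwrite"` would leave only day 2's row, and with the default mode the second write would raise, which is exactly why only the `append` snippet satisfies the historical-analysis requirement.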
Author: LeetQuiz Editorial Team
A data science team needs to store daily churn predictions generated by a production MLflow model in a Delta Lake table. The solution must support historical analysis, allowing data scientists to compare predictions over time. Churn predictions are generated at most once per day. Which code snippet achieves this requirement with the lowest compute overhead and cost?
A
```python
(preds.writeStream
    .outputMode("overwrite")
    .option("checkpointPath", "/_checkpoints/churn_preds")
    .start("/preds/churn_preds"))
```
B
```python
preds.write.mode("append").saveAsTable("churn_preds")
```
C
```python
preds.write.format("delta").save("/preds/churn_preds")
```
D
```python
(preds.writeStream
    .outputMode("append")
    .option("checkpointPath", "/_checkpoints/churn_preds")
    .table("churn_preds"))
```
E
```python
(preds.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("churn_preds"))
```