
Answer-first summary for fast verification

Answer: Replace the current overwrite logic with a `MERGE` statement and enable the Delta Lake Change Data Feed (CDF) to identify and process only those records that have been inserted or updated.

The most efficient approach for incremental processing in Databricks is to leverage **Delta Lake Change Data Feed (CDF)** combined with **MERGE** statements.

**Why this works:** Delta Lake's CDF records every insert, update, and delete produced by write operations such as `MERGE`. Once CDF is enabled, the machine learning pipeline can query only the changed rows, either by setting the `readChangeFeed` read option or by calling the `table_changes()` SQL function. This lets the model score only new or updated records, eliminating the compute waste of rescanning unchanged data.

**Analysis of the other options:**

* **Manual diffs:** Performing a manual difference calculation requires a costly join between two large tables, which is redundant when CDF is available.
* **Timestamps with overwrite:** Because an overwrite replaces the entire table, every row receives the same timestamp, making it impossible to isolate which records actually changed.
* **Complete output mode:** In Structured Streaming, `complete` mode rewrites the entire result set to the sink on every trigger, which scales poorly and defeats the purpose of incremental scoring.
* **Filtering post-scoring:** Applying the model to all rows and then discarding unchanged results still incurs the full compute cost of running the ML model on the entire dataset.
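The pattern above can be sketched in Databricks SQL. This is a minimal illustration, not the question's actual pipeline: the staging table `customer_churn_params_staging`, the `customer_id` key, and the starting version `5` are all assumed for the example.

```sql
-- Enable the Change Data Feed on the target table (one-time setup).
ALTER TABLE customer_churn_params
SET TBLPROPERTIES (delta.enableChangeDataFeed = true);

-- Replace the overwrite with an upsert from a staging snapshot.
-- (Staging table and join key are illustrative assumptions.)
MERGE INTO customer_churn_params AS target
USING customer_churn_params_staging AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Downstream, the scoring job reads only rows changed since a
-- given table version, keeping inserts and post-update images.
SELECT *
FROM table_changes('customer_churn_params', 5)
WHERE _change_type IN ('insert', 'update_postimage');
```

Filtering on `_change_type` matters because CDF also emits `update_preimage` and `delete` rows, which the churn model should not re-score.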
Author: LeetQuiz Editorial Team
The machine learning team needs to optimize the workflow for identifying changed records in the `customer_churn_params` table to trigger updates for their churn prediction model. Which method would most effectively streamline the identification of these records for incremental processing?
A
Calculate the difference between the previous model predictions and the current `customer_churn_params` using a unique customer key before making new predictions, only processing customers not found in the previous set.
B
Modify the overwrite logic to include a field populated by `spark.sql.functions.current_timestamp()` during the write process, then use this field to filter for records written on a specific date.
C
Replace the current overwrite logic with a `MERGE` statement and enable the Delta Lake Change Data Feed (CDF) to identify and process only those records that have been inserted or updated.
D
Convert the batch job to a Structured Streaming job using `complete` output mode to read from the `customer_churn_params` table and incrementally predict against the churn model.
E
Apply the churn model to all rows in the `customer_churn_params` table, but implement logic to perform an upsert into the predictions table that ignores rows where predictions have not changed.