
Answer-first summary for fast verification
Answer: Use a `MERGE INTO` operation with a `WHEN NOT MATCHED` clause based on a unique key.
### Explanation

**Correct Choice: Use a `MERGE INTO` operation with a `WHEN NOT MATCHED` clause based on a unique key.**

* **Atomic Idempotency**: Delta Lake's `MERGE INTO` statement supports conditional logic within a single transaction. With a `WHEN NOT MATCHED THEN INSERT` clause keyed on a primary or unique key, incoming records are checked against the target table: if the key already exists, the record is ignored; if not, it is inserted. Late-arriving duplicates or re-run batches therefore never produce duplicate rows in the target table.
* **Efficiency**: This method is efficient for incremental loads because it only writes new records rather than rewriting the entire table.

**Why the other options are incorrect:**

* **Schema Enforcement**: Delta Lake schema enforcement (and schema evolution) ensures data type and column name consistency. It does not validate row uniqueness or enforce primary key constraints.
* **VACUUM**: The `VACUUM` command handles data retention and storage optimization by deleting old data files that are no longer referenced by the current table state. It has no deduplication functionality.
* **Full Outer Join + Overwrite**: While a join can identify duplicates, performing a full outer join and overwriting the entire table on every batch is computationally expensive for large production datasets, especially compared to the incremental nature of a `MERGE` operation.
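The pattern above can be sketched as follows. This is a minimal example, assuming a hypothetical Delta target table `events` with unique key `event_id` and a staging view `events_updates` holding the current micro-batch (already de-duplicated within the batch):

```sql
-- Idempotent append: rows whose event_id already exists in the target
-- are silently skipped; only genuinely new rows are inserted.
MERGE INTO events AS target
USING events_updates AS source
ON target.event_id = source.event_id
WHEN NOT MATCHED THEN
  INSERT *
```

Because there is no `WHEN MATCHED` clause, re-running the same batch is a no-op: every key already matches, so nothing is inserted twice.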
Author: LeetQuiz Editorial Team
A data engineer is designing a pipeline to handle late-arriving and duplicate records. Beyond de-duplicating data within the current micro-batch, which technique effectively prevents duplicate records from being inserted into an existing Delta table by checking against previously processed data?
A. Perform a full outer join on a unique key and overwrite the entire target table with the result.

B. Enable Delta Lake schema enforcement to block duplicate records during the write operation.

C. Use a MERGE INTO operation with a WHEN NOT MATCHED clause based on a unique key.

D. Execute the VACUUM command on the Delta table after each batch to remove stale duplicates.