Databricks Certified Data Engineer - Professional


A data engineer is designing an ETL workflow to handle late-arriving and potentially duplicate records from a single data source. While batch-level deduplication is feasible, the engineer needs a method to deduplicate incoming data against records already residing in the target Delta table. Which approach allows the engineer to deduplicate data against previously processed records during the insertion process?


Last updated: January 6, 2026 at 15:40

Configure the table property delta.deduplicate to true.

9.5%

Execute a VACUUM operation on the Delta table after each batch completes.

4.8%


Implement an 'insert-only' MERGE operation with a matching condition on a unique key.

81.0%
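
To illustrate why the insert-only MERGE option fits the question, here is a minimal sketch. The SQL string shows the general shape of a Delta Lake insert-only merge; the table names `target` and `batch` and the key column `event_id` are hypothetical, not taken from the question. The Python functions are only a plain-Python model of the semantics, not Delta Lake code, so the behavior can be checked without a Spark cluster. The batch is deduplicated within itself first because, in this model, a merge without that step would insert both copies of a new key that appears twice in the same batch.

```python
# Shape of an insert-only merge in Delta Lake SQL.
# Table names and the key column are hypothetical placeholders.
INSERT_ONLY_MERGE_SQL = """
MERGE INTO target AS t
USING batch AS s
ON t.event_id = s.event_id
WHEN NOT MATCHED THEN INSERT *
"""


def dedupe_batch(batch, key):
    """Batch-level deduplication: keep the first record seen for each key."""
    first_seen = {}
    for row in batch:
        first_seen.setdefault(row[key], row)
    return list(first_seen.values())


def insert_only_merge(target, batch, key):
    """Plain-Python model of MERGE ... WHEN NOT MATCHED THEN INSERT.

    Rows whose key already exists in the target are skipped; existing
    target rows are never updated or deleted.
    """
    existing_keys = {row[key] for row in target}
    new_rows = [row for row in batch if row[key] not in existing_keys]
    return target + new_rows


# Records already processed into the target table.
target = [{"event_id": 1, "v": "a"}, {"event_id": 2, "v": "b"}]

# Incoming batch: a late duplicate of key 2, plus key 3 appearing twice.
batch = [
    {"event_id": 2, "v": "b-late"},
    {"event_id": 3, "v": "c"},
    {"event_id": 3, "v": "c-dup"},
]

result = insert_only_merge(target, dedupe_batch(batch, "event_id"), "event_id")
```

In this sketch the late duplicate of key 2 is dropped because it matches an existing row, and only one record for key 3 is inserted. Against a real Delta table, the SQL string would typically be passed to `spark.sql(...)` after registering the deduplicated batch as a view named `batch`.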