Databricks Certified Data Engineer - Professional

Which approach enables a data engineer to deduplicate incoming records against previously processed data when inserting into a Delta table, in addition to handling intra-batch deduplication for late-arriving records?

Explanation:

The correct answer is C. An insert-only merge is a MERGE statement that has only a WHEN NOT MATCHED THEN INSERT clause and a match condition on a unique key. A source record is inserted only if no row with the same key already exists in the Delta table, so incoming data is deduplicated against previously processed records.

  • A is incorrect because schema enforcement validates column names and data types on write; it does not detect duplicate records.
  • B is incorrect because VACUUM removes data files that the table no longer references; it does not deduplicate records.
  • D is incorrect because a full outer join followed by an overwrite rewrites the entire table, which is inefficient and riskier than a single atomic MERGE.
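
The two deduplication steps named in the question can be illustrated with a minimal sketch. It models the semantics on plain Python data rather than calling Delta Lake: the function name `insert_only_merge`, the `id` key, and the sample records are all hypothetical, and the target table is represented as a dictionary keyed by the unique key.

```python
def insert_only_merge(target, batch, key="id"):
    """Model an insert-only merge on plain Python data (not a Delta Lake call).

    target: dict mapping key value -> record, standing in for the Delta table.
    batch:  list of incoming records (dicts), possibly containing duplicates.
    Returns the list of records that were actually inserted.
    """
    # Step 1: intra-batch deduplication, keeping the first record seen per key.
    deduped = {}
    for record in batch:
        deduped.setdefault(record[key], record)

    # Step 2: insert-only merge. Keys already in the target are left untouched
    # (no update, no delete); only unmatched keys are inserted.
    inserted = []
    for k, record in deduped.items():
        if k not in target:
            target[k] = record
            inserted.append(record)
    return inserted


# Hypothetical data: id 1 was processed earlier; id 2 arrives twice in one batch.
target = {1: {"id": 1, "value": "a"}}
batch = [
    {"id": 1, "value": "a-late"},
    {"id": 2, "value": "b"},
    {"id": 2, "value": "b-dup"},
]
inserted = insert_only_merge(target, batch)
```

In Delta Lake itself, the same pattern would typically be a `dropDuplicates` on the incoming batch followed by `MERGE INTO ... WHEN NOT MATCHED THEN INSERT *`; the sketch above only mirrors that behavior and makes no claim about how a particular pipeline implements it.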