LeetQuiz Logo
Privacy Policy•contact@leetquiz.com
© 2025 LeetQuiz All rights reserved.
Databricks Certified Data Engineer - Professional

Databricks Certified Data Engineer - Professional

Get started today

Ultimate access to all questions.


A junior data engineer is using the following code to de-duplicate raw streaming data and insert them into a target Delta table.

spark.readStream.table("orders_raw")
.dropDuplicates(["order_id", "order_timestamp"])
.writeStream .option("checkpointLocation", "dbfs: /checkpoints")
.table("orders_unique")

However, a senior data engineer points out that this approach may not suffice for ensuring distinct records in the target table, especially with late-arriving duplicates. What could explain the senior engineer's concern?

Real Exam
Quiz related visual




Powered ByGPT-5