
Answer-first summary for fast verification
Answer: Use the Delta Lake `MERGE INTO` statement to update the silver layer with new or changed records from the bronze layer, ensuring data quality by enforcing schema constraints.
Correct Answer: A — Use the Delta Lake MERGE INTO statement.

MERGE INTO is the recommended Delta Lake approach for incremental upserts from the bronze to the silver layer:

- Processes only new or changed records, making it cost-effective.
- Handles deduplication by matching on a unique key.
- Enforces schema constraints to maintain data quality.

Unlike append mode (Option D), it prevents duplicates, and unlike Options B and C, it avoids rewriting the entire dataset.

```sql
MERGE INTO silver_table AS s
USING bronze_table_updates AS b
ON s.id = b.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
```
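The upsert semantics behind MERGE INTO can be sketched in plain Python. This is a simplified illustrative model, not the Delta Lake API; the table shape and the `id`/`value` column names are assumptions for the example.

```python
# Minimal sketch of MERGE INTO upsert semantics: bronze rows are matched
# against silver on a unique key; matches are updated, non-matches inserted.
# The key-based update guarantees deduplication — at most one row per id.

def merge_into(silver: dict, bronze_updates: list) -> dict:
    """Upsert bronze rows into the silver table, keyed on 'id'."""
    for row in bronze_updates:
        silver[row["id"]] = row  # WHEN MATCHED: update; WHEN NOT MATCHED: insert
    return silver

silver = {1: {"id": 1, "value": "a"}, 2: {"id": 2, "value": "b"}}
updates = [{"id": 2, "value": "b2"}, {"id": 3, "value": "c"}]
merged = merge_into(silver, updates)
# merged holds ids 1, 2 (updated), and 3 — exactly one row per key
```

Note how only the two incoming rows are touched; the unmatched silver row is left in place, which is what makes the incremental pattern cheaper than a full overwrite.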
Author: LeetQuiz Editorial Team
You are designing a data pipeline in Azure Databricks to incrementally process data from a bronze to a silver layer using Delta Lake. The pipeline must ensure data quality, handle deduplication, and be cost-effective. Which of the following approaches BEST meets these requirements? Choose one option.
A
Use the Delta Lake MERGE INTO statement to update the silver layer with new or changed records from the bronze layer, ensuring data quality by enforcing schema constraints.
B
Implement a custom deduplication logic using a combination of SELECT DISTINCT and OVERWRITE statements, which may increase processing time and costs.
C
Leverage the Delta Lake WRITE statement with the overwriteSchema option to ensure schema enforcement and prevent data quality issues, but this may not handle deduplication effectively.
D
Utilize the Delta Lake READ and WRITE statements with the append mode to incrementally process data while maintaining data quality and deduplication, optimizing for cost and performance.