
Answer-first summary for fast verification
Answer: MERGE
For Type 2 slowly changing dimensions (SCD) in Azure Databricks with Delta Lake tables, MERGE is the correct choice because it supports the conditional logic an SCD Type 2 implementation requires. Type 2 SCD preserves history by inserting a new row for each changed record while retaining the old version, rather than updating rows in place. A single MERGE statement handles all the required actions in one atomic transaction:

- **WHEN MATCHED** clauses update existing records (e.g., setting end dates on previous versions)
- **WHEN NOT MATCHED** clauses insert new rows for new or changed dimension members
- **WHEN NOT MATCHED BY SOURCE** clauses handle deletions or archival

Delta Lake's ACID transaction guarantees keep the dimension table consistent throughout. The other operations are less suitable:

- **CREATE**: only creates new tables; it cannot modify existing data
- **UPDATE**: only modifies existing rows; it cannot insert the new rows Type 2 history requires
- **ALTER**: changes the table's structure, not its data

Because MERGE combines multiple data modification operations in a single statement, it is ideal for the requirements of Type 2 SCD implementations in Delta Lake environments.
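As an illustration, here is a minimal Spark SQL sketch of the pattern, assuming hypothetical columns (`customer_id` as the business key, an `attrs` value column, plus `is_current`, `start_date`, and `end_date` tracking columns) and a staged source view named `updates`; these names are not from the question and would vary per schema:

```sql
-- Sketch only: close the current version of a changed row and insert
-- brand-new keys. Column and table names are illustrative assumptions.
MERGE INTO Table1 AS tgt
USING updates AS src
ON tgt.customer_id = src.customer_id AND tgt.is_current = true
WHEN MATCHED AND tgt.attrs <> src.attrs THEN
  -- expire the previous version instead of overwriting it
  UPDATE SET tgt.is_current = false, tgt.end_date = current_date()
WHEN NOT MATCHED THEN
  -- insert a fresh current row for keys not yet in the dimension
  INSERT (customer_id, attrs, is_current, start_date, end_date)
  VALUES (src.customer_id, src.attrs, true, current_date(), null);
```

Note that a matched key can trigger only one action per MERGE, so in practice the source is often staged as a union of the raw updates with an extra copy of the changed rows, letting one statement both expire the old version and insert the new one.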
Author: LeetQuiz Editorial Team
You have an Azure Databricks workspace containing a Delta Lake dimension table named Table1, which is a Type 2 slowly changing dimension (SCD) table. You need to apply updates from a source table to Table1.
Which Apache Spark SQL operation should you use?
A. CREATE
B. UPDATE
C. ALTER
D. MERGE