
Explanation:
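For a Type 2 slowly changing dimension (SCD) stored as a Delta Lake table in Azure Databricks, MERGE is the appropriate operation because it supports the conditional logic that Type 2 requires. A Type 2 SCD preserves history: when a source record changes, the existing row is kept (typically marked as no longer current) and a new row is inserted with the new values, rather than the old values being overwritten in place. MERGE can express this in a single atomic transaction, using WHEN MATCHED clauses to update the existing rows that need to be closed out and WHEN NOT MATCHED clauses to insert new rows.
Because the whole statement commits as one transaction, it carries Delta Lake's ACID guarantees, which keeps the dimension table consistent. The other options are less suitable: CREATE defines a new object rather than modifying existing data, UPDATE changes existing rows but cannot insert new ones, and ALTER changes a table's schema or properties rather than its rows.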
The MERGE operation's ability to handle multiple data modification operations in a single statement makes it ideal for the complex requirements of Type 2 SCD implementations in Delta Lake environments.
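As a rough illustration of the pattern described above, the following Spark SQL sketch applies Type 2 changes to Table1 in one statement. Only the name Table1 comes from the question; the source table name (source_updates) and every column name (business_key, attribute, effective_date, is_current, start_date, end_date) are hypothetical and would need to match the real schema. It assumes at most one source row per business key.

```sql
-- Sketch only: source_updates and all column names are assumptions.
MERGE INTO Table1 AS t
USING (
  -- Pass 1: every source row, keyed so it can match an existing dimension row.
  SELECT s.business_key AS merge_key, s.*
  FROM source_updates AS s

  UNION ALL

  -- Pass 2: changed rows again, with a NULL key so they never match
  -- and therefore fall through to the INSERT branch as the new version.
  SELECT NULL AS merge_key, s.*
  FROM source_updates AS s
  JOIN Table1 AS cur
    ON s.business_key = cur.business_key
  WHERE cur.is_current = true
    AND s.attribute <> cur.attribute
) AS staged
ON t.business_key = staged.merge_key

-- Close out the current version of a changed record.
WHEN MATCHED AND t.is_current = true AND t.attribute <> staged.attribute THEN
  UPDATE SET is_current = false, end_date = staged.effective_date

-- Insert brand-new keys and the new versions of changed records.
WHEN NOT MATCHED THEN
  INSERT (business_key, attribute, is_current, start_date, end_date)
  VALUES (staged.business_key, staged.attribute, true, staged.effective_date, NULL)
```

The source is staged twice because a single source row cannot both update its matching target row and insert a new one; the NULL-keyed copy is what produces the new current row. Source rows whose attribute is unchanged match on the key but satisfy neither branch, so they leave Table1 untouched.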
You have an Azure Databricks workspace containing a Delta Lake dimension table named Table1, which is a Type 2 slowly changing dimension (SCD) table. You need to apply updates from a source table to Table1.
Which Apache Spark SQL operation should you use?
A. CREATE
B. UPDATE
C. ALTER
D. MERGE