
Databricks Certified Data Engineer - Associate
Consider a scenario where you need to update a Delta Lake table with new data from an external source, ensuring that existing records are updated and new records are inserted without causing duplicates. Which SQL command would you use to achieve this, and why is it the best choice for this scenario?
Explanation:
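MERGE INTO is the best choice for this scenario. It matches each incoming row against the target table on a join condition, then updates the rows that match (WHEN MATCHED) and inserts the rows that do not (WHEN NOT MATCHED), all within a single transaction. A record that already exists is therefore updated in place rather than inserted a second time, which is what prevents duplicates. This is more efficient and accurate than CREATE OR REPLACE TABLE or INSERT OVERWRITE, which replace the table's existing data rather than operating conditionally on individual rows, so they cannot update some records while inserting others.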
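
As a minimal sketch of the upsert described above, the statement below merges a staging table into a target Delta table. The table names (customers, customer_updates) and column names (customer_id, email, updated_at) are hypothetical, chosen only for illustration; the example assumes customer_id uniquely identifies a record in both tables.

```sql
-- Hypothetical tables: `customers` is the target Delta table,
-- `customer_updates` holds the new data from the external source.
MERGE INTO customers AS target
USING customer_updates AS source
  ON target.customer_id = source.customer_id   -- assumed unique key
WHEN MATCHED THEN
  -- The record already exists: update it in place.
  UPDATE SET
    target.email      = source.email,
    target.updated_at = source.updated_at
WHEN NOT MATCHED THEN
  -- The record is new: insert it.
  INSERT (customer_id, email, updated_at)
  VALUES (source.customer_id, source.email, source.updated_at);
```

Note that the no-duplicates guarantee depends on the source having at most one row per key. If several source rows match the same target row, Delta Lake rejects the merge with an error, so the source is typically deduplicated before the statement runs.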