Databricks Certified Data Engineer - Associate


Consider a scenario where you need to update a Delta Lake table with new data from an external source, ensuring that existing records are updated and new records are inserted without causing duplicates. Which SQL command would you use to achieve this, and why is it the best choice for this scenario?




Explanation:

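MERGE INTO is the best choice for this scenario. It matches source rows to target rows on a join condition (typically a unique key) and applies conditional actions in a single atomic transaction: rows that match are updated, and rows that do not match are inserted. Because each incoming record is either applied to its existing row or inserted once, the operation does not create duplicates, provided the source holds at most one row per key. Alternatives such as CREATE OR REPLACE TABLE or INSERT OVERWRITE replace existing data wholesale instead of matching individual rows, so they cannot update some records while inserting others; a plain INSERT INTO only appends, so re-sent records would be duplicated.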
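
As a minimal sketch of such an upsert, the statement below merges a staging table into a target Delta table. The table names (`customers`, `customer_updates`) and the columns (`customer_id`, `name`, `email`) are hypothetical and chosen only for illustration; the example also assumes `customer_id` uniquely identifies a record in both tables.

```sql
-- Hypothetical tables: customers (target Delta table) and
-- customer_updates (new data loaded from the external source).
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.customer_id = s.customer_id      -- assumed unique key
WHEN MATCHED THEN
  -- The record already exists: update it in place.
  UPDATE SET
    t.name  = s.name,
    t.email = s.email
WHEN NOT MATCHED THEN
  -- The record is new: insert it once.
  INSERT (customer_id, name, email)
  VALUES (s.customer_id, s.name, s.email);
```

If the source can contain several rows for the same key, it is usually worth deduplicating it first (for example, keeping only the latest row per `customer_id`), since a merge in which multiple source rows match the same target row can fail.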