
You are integrating data from multiple sources into a single target table in a Databricks environment. The solution must be efficient, handle duplicate records, and apply deletions correctly. Considering cost, compliance, and scalability, which of the following commands is the BEST choice for this task, and why?
A
CREATE OR REPLACE TABLE, because it simplifies the process by allowing you to either create a new table or replace an existing one with the combined data, but it does not efficiently handle duplicates or deletions.
B
INSERT OVERWRITE, because it efficiently overwrites the target table with new data from multiple sources, but it lacks the capability to handle duplicates and deletions properly.
C
MERGE, because it not only combines data from multiple sources into the target table but also efficiently handles duplicates and deletions, ensuring data integrity and compliance.
D
COPY INTO, because it loads data from source files into a target table and skips files it has already loaded, which prevents duplicate loads, but it does not handle updates or deletions.
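
For reference, the pattern described in option C might look like the following minimal sketch, assuming Delta tables. All table and column names (`target`, `source_a`, `source_b`, `id`, `name`, `updated_at`, `is_deleted`) are hypothetical and not part of the question. The subquery keeps only the latest row per key so that duplicates across sources collapse before the merge, and an assumed soft-delete flag in the source drives deletions.

```sql
-- Hypothetical schema: target(id, name, updated_at);
-- source_a and source_b share those columns plus a boolean is_deleted flag.
MERGE INTO target AS t
USING (
  -- Combine the sources, then keep one row per id (the most recent).
  SELECT id, name, updated_at, is_deleted
  FROM (
    SELECT
      id, name, updated_at, is_deleted,
      ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
    FROM (
      SELECT id, name, updated_at, is_deleted FROM source_a
      UNION ALL
      SELECT id, name, updated_at, is_deleted FROM source_b
    ) AS combined
  ) AS ranked
  WHERE rn = 1
) AS s
ON t.id = s.id
-- Key exists in the target and the source marks it deleted: remove it.
WHEN MATCHED AND s.is_deleted = true THEN
  DELETE
-- Key exists in the target: update it in place.
WHEN MATCHED THEN
  UPDATE SET t.name = s.name, t.updated_at = s.updated_at
-- Key is new and not deleted: insert it.
WHEN NOT MATCHED AND s.is_deleted = false THEN
  INSERT (id, name, updated_at) VALUES (s.id, s.name, s.updated_at)
```

The deduplication sits in the USING clause because Delta Lake rejects a merge in which several source rows match and try to modify the same target row. Only the affected rows are rewritten, whereas CREATE OR REPLACE TABLE and INSERT OVERWRITE replace the table contents wholesale.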