
Explanation:
Let's analyze each option:
A. DROP DUPLICATES FROM my_table; - This is not a valid SQL command. There is no DROP DUPLICATES statement in standard SQL or Databricks SQL.
B. SELECT * FROM my_table WHERE duplicate = False; - This assumes there's a column named duplicate in the table, which is unlikely. Even if such a column existed, it wouldn't necessarily remove all duplicates from the result set.
C. SELECT DISTINCT * FROM my_table; - ✅ CORRECT - This command will return all records from my_table but will remove duplicate rows (rows where all column values are identical). The DISTINCT keyword ensures that only unique rows are returned.
D. MERGE INTO my_table a USING new_records b ON a.id = b.id WHEN NOT MATCHED THEN INSERT *; - This is a MERGE operation that inserts new records from new_records into my_table where they don't already exist. It doesn't remove duplicates from the existing table.
E. MERGE INTO my_table a USING new_records b; - This is an incomplete MERGE statement that will result in a syntax error because it's missing the required ON clause and WHEN conditions.
Therefore, only option C (SELECT DISTINCT * FROM my_table;) will return records from the existing Delta table with duplicates removed.
Ultimate access to all questions.
No comments yet.
Question 18
Which of the following commands will return records from an existing Delta table my_table where duplicates have been removed?
A
DROP DUPLICATES FROM my_table;
B
SELECT * FROM my_table WHERE duplicate = False;
C
SELECT DISTINCT * FROM my_table;
D
MERGE INTO my_table a USING new_records b ON a.id = b.id WHEN NOT MATCHED THEN INSERT *;
E
MERGE INTO my_table a USING new_records b;