
Ultimate access to all questions.
Which of the following commands will return records from an existing Delta table my_table where duplicates have been removed?
A
DROP DUPLICATES FROM my_table;
B
SELECT * FROM my_table WHERE duplicate = False;
C
SELECT DISTINCT * FROM my_table;
D
MERGE INTO my_table a USING new_records b ON a.id = b.id WHEN NOT MATCHED THEN INSERT *;
E
MERGE INTO my_table a USING new_records b;
Explanation:
Let's analyze each option:
A. DROP DUPLICATES FROM my_table;
DROP DUPLICATES statement.B. SELECT * FROM my_table WHERE duplicate = False;
duplicate in the table, which is not guaranteed. This would only filter rows where a specific column has value False, not remove duplicates.C. SELECT DISTINCT * FROM my_table; ✓
SELECT DISTINCT * statement returns all unique rows from the table by comparing all columns. It removes duplicate rows where all column values are identical.D. MERGE INTO my_table a USING new_records b ON a.id = b.id WHEN NOT MATCHED THEN INSERT *;
new_records into my_table when they don't already exist (based on id match). This doesn't remove duplicates from the existing table; it only prevents inserting duplicates from a source table.E. MERGE INTO my_table a USING new_records b;
Key Points:
SELECT DISTINCT * is the standard SQL way to return unique rows from a table.DROP DUPLICATES as part of a Delta Live Tables pipeline or use DELETE with window functions to remove duplicates, but for simply returning records without duplicates, SELECT DISTINCT * is the correct choice.