Databricks Certified Data Engineer - Associate


You are working with a large dataset stored in a Delta Lake table named 'user_data'. The table contains duplicate entries that need to be removed. Write a Spark SQL query to create a new table 'unique_user_data' that contains only the distinct rows from 'user_data'.


CREATE TABLE unique_user_data AS SELECT DISTINCT * FROM user_data

87.4%

CREATE TABLE unique_user_data AS SELECT * FROM user_data GROUP BY *

3.6%


CREATE TABLE unique_user_data AS SELECT * FROM user_data UNION SELECT * FROM user_data

2.8%
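
To see what the SELECT DISTINCT option does to duplicate rows, here is a minimal sketch that runs the same statement against a tiny table. It rests on assumptions that are not in the question: SQLite (through Python's built-in sqlite3 module) stands in for Spark SQL so the example runs locally, and the columns user_id and name and the sample rows are hypothetical. On Databricks the statement itself would be run against the real Delta table instead.

```python
import sqlite3

# Assumption: SQLite is only a local stand-in for Spark SQL here.
# The columns (user_id, name) and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_data (user_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO user_data VALUES (?, ?)",
    [(1, "ana"), (1, "ana"), (2, "bo"), (2, "bo"), (3, "cy")],
)

# Same statement as the SELECT DISTINCT option in the question.
conn.execute("CREATE TABLE unique_user_data AS SELECT DISTINCT * FROM user_data")

# Compare the row count of the source with the contents of the new table.
total_rows = conn.execute("SELECT COUNT(*) FROM user_data").fetchone()[0]
unique_rows = conn.execute(
    "SELECT * FROM unique_user_data ORDER BY user_id"
).fetchall()
print(total_rows, unique_rows)
```

The source table keeps its five rows, while the new table holds one copy of each distinct row. DISTINCT compares entire rows, so two rows that differ in any column are both kept.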