
Explanation:
In Spark SQL, SELECT DISTINCT * FROM table_name removes duplicate rows from the query's result set; it does not modify the underlying table. DROP DUPLICATES is not a SQL command; the equivalent operation exists in the DataFrame API as df.dropDuplicates(). To remove duplicates permanently, one would typically use a MERGE operation, or filter with a window function such as ROW_NUMBER() and overwrite the table with the result.
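The difference between deduplicating a result set and deduplicating the stored data can be sketched as follows. This is a minimal illustration, not Spark itself: it uses Python's built-in sqlite3 module as a stand-in so it runs without a cluster, and the table name events, its columns, and the temporary table deduped are hypothetical. The DISTINCT and ROW_NUMBER() queries use standard SQL that Spark SQL also accepts, but the overwrite step in a real Spark job would look different (for example, an overwrite write or a MERGE).

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for a Spark SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, name TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (2, "b"), (3, "c")],
)

# SELECT DISTINCT deduplicates only the result set of this one query.
distinct_rows = conn.execute(
    "SELECT DISTINCT * FROM events ORDER BY id"
).fetchall()

# The stored table is unchanged: all five rows are still present.
rows_before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# Permanent removal: number the rows within each duplicate group,
# keep the first of each group, then overwrite the table contents.
conn.execute(
    """
    CREATE TEMP TABLE deduped AS
    SELECT id, name FROM (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY id, name ORDER BY id) AS rn
        FROM events
    )
    WHERE rn = 1
    """
)
conn.execute("DELETE FROM events")
conn.execute("INSERT INTO events SELECT id, name FROM deduped")
rows_after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

print(distinct_rows)            # [(1, 'a'), (2, 'b'), (3, 'c')]
print(rows_before, rows_after)  # 5 3
```

The row count stays at 5 after the SELECT DISTINCT query and drops to 3 only after the table is explicitly overwritten, which is the point the explanation makes.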