Databricks Certified Machine Learning - Associate


A data engineering team is utilizing Databricks to write data to a Delta table named 'sensor_readings'. Their goal is to ensure that any duplicate records, based on a specific key column, are removed before writing to the table. Which of the following code snippets should they use to achieve this efficiently?


sensor_readings.distinct().write.format('delta').mode('overwrite').saveAsTable('sensor_readings')

13.0%

spark.sql('SELECT DISTINCT * FROM sensor_readings').write.format('delta').mode('overwrite').saveAsTable('sensor_readings')

21.7%


sensor_readings.dropDuplicates(['key_column']).write.format('delta').mode('append').saveAsTable('sensor_readings')

47.8%
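
For reference, key-based deduplication before a Delta write can be sketched as below. This is a minimal sketch rather than the exam's answer key: the helper name `write_deduplicated` is hypothetical, and defaulting to `overwrite` is an assumption about the intended behavior. It relies only on the standard PySpark calls `DataFrame.dropDuplicates` and `DataFrameWriter.format/mode/saveAsTable`.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Needed for type hints only; at runtime the caller supplies a real
    # PySpark DataFrame (for example, on a Databricks cluster).
    from pyspark.sql import DataFrame


def write_deduplicated(
    df: "DataFrame",
    key_column: str,
    table_name: str,
    mode: str = "overwrite",
) -> "DataFrame":
    """Keep one row per key_column value, then save the result as a Delta table.

    Hypothetical helper for illustration; not part of any Databricks API.
    """
    # Passing the key as a list limits the duplicate check to that column.
    # distinct() would instead compare entire rows.
    # Which of the duplicate rows survives is not guaranteed.
    deduped = df.dropDuplicates([key_column])

    # Assumption: "overwrite" replaces the table contents. With "append",
    # the deduplicated rows are added to whatever the table already holds.
    deduped.write.format("delta").mode(mode).saveAsTable(table_name)
    return deduped
```

On a cluster this might be called as `write_deduplicated(incoming_df, 'key_column', 'sensor_readings')`, where `incoming_df` is a placeholder for the DataFrame holding the new readings.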