
Databricks Certified Data Engineer - Associate
When using the INSERT OVERWRITE command to reload the customer_sales table, how does it affect the ability to review previous versions of the data?
Explanation:
The INSERT OVERWRITE command replaces the current contents of the table, but on a Delta table it does so by writing a new table version. Earlier versions stay in the table history, so you can still query the prior data using time travel. In general, any DML/DDL operation on a Delta table (except DROP TABLE) preserves the historical versions of the data, as long as the older data files have not been removed, for example by VACUUM. For example, to query version 1 of the customer_sales table, you can use: SELECT * FROM customer_sales VERSION AS OF 1. To see all historical changes on the table, use: DESCRIBE HISTORY table_name.

Note that the main difference between INSERT OVERWRITE and CREATE OR REPLACE TABLE AS SELECT (CRAS) is that CRAS can modify the schema of the table, whereas INSERT OVERWRITE by default only overwrites the data unless spark.databricks.delta.schema.autoMerge.enabled is set to true.
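
The sequence described in the explanation can be sketched in Databricks SQL roughly as follows. This is a minimal illustration under stated assumptions, not a definitive recipe: customer_sales_staging is a hypothetical source table, and version 1 is an assumed version number. Check the output of DESCRIBE HISTORY to see which versions actually exist on your table.

```sql
-- Reload the table; on a Delta table this is recorded as a new table version.
-- customer_sales_staging is a hypothetical source table used for illustration.
INSERT OVERWRITE customer_sales
SELECT * FROM customer_sales_staging;

-- List the table's versions and the operation that produced each one.
DESCRIBE HISTORY customer_sales;

-- Query an earlier version with time travel (version 1 is an assumed example).
SELECT * FROM customer_sales VERSION AS OF 1;

-- By contrast, CREATE OR REPLACE TABLE ... AS SELECT can also change the schema.
CREATE OR REPLACE TABLE customer_sales AS
SELECT * FROM customer_sales_staging;
```

Both reload statements add a new entry to the table history instead of erasing it, which is why the time travel query keeps working after the reload.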