
Answer-first summary for fast verification
Answer: The `COPY INTO` command is idempotent, meaning it skips files in the source location that have already been loaded.
The `COPY INTO` SQL command loads data from a file location into a Delta table in a retriable and idempotent manner: files in the source location that have already been loaded are skipped, which prevents duplicate records in the table. To override this behavior and force ingestion of previously loaded files, set the `force` option to `true` in `COPY_OPTIONS`:

```
COPY INTO my_table
FROM 'dbfs:/mnt/retail/bronze/*.csv'
FILEFORMAT = CSV
COPY_OPTIONS ('force' = 'true')
```

This disables idempotency, so files are loaded regardless of whether they were loaded before.
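The skip-unless-forced behavior can be illustrated with a small toy model. This is only a conceptual sketch in Python, not how Databricks implements `COPY INTO`: the `ToyTable` class, its `loaded_files` set, and the `copy_into` method are hypothetical names invented for illustration, and real load tracking is handled internally by the platform.

```python
# Toy model of idempotent file loading. Hypothetical names throughout;
# this is NOT the actual COPY INTO implementation.
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    rows: list = field(default_factory=list)
    # Hypothetical tracking state: paths of files already ingested.
    loaded_files: set = field(default_factory=set)

    def copy_into(self, source_files: dict, force: bool = False) -> int:
        """Load rows from source_files (path -> list of rows).

        Returns the number of rows inserted by this call.
        """
        inserted = 0
        for path, file_rows in sorted(source_files.items()):
            if path in self.loaded_files and not force:
                continue  # already loaded: skip to avoid duplicate records
            self.rows.extend(file_rows)
            self.loaded_files.add(path)
            inserted += len(file_rows)
        return inserted


source = {
    "bronze/a.csv": [("a", 1), ("a", 2)],
    "bronze/b.csv": [("b", 3)],
}
table = ToyTable()
first = table.copy_into(source)               # first run loads both files
second = table.copy_into(source)              # rerun skips both files
forced = table.copy_into(source, force=True)  # force reloads and duplicates rows
print(first, second, forced, len(table.rows))  # prints: 3 0 3 6
```

In this model the second call inserts nothing because both paths are already tracked, which mirrors the rerun described in the question. The forced call inserts the same three rows again, which is why forcing a reload is usually reserved for deliberate backfills.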
Author: LeetQuiz Editorial Team
A data engineer loaded data from external CSV files into a Delta table using the COPY INTO command as shown below:
COPY INTO my_table
FROM 'dbfs:/mnt/retail/bronze/*.csv'
FILEFORMAT = CSV
After successfully running the command for the first time, it was immediately run again, but no data was loaded the second time. What could be the reason for this?
A. Execute the REFRESH TABLE command before running COPY INTO again.
B. The COPY INTO command is designed for one-time data ingestion into a Delta table and cannot handle incremental data.
C. Use AUTO LOADER instead of COPY INTO for incremental data ingestion.
D. The COPY INTO command is idempotent, meaning it skips files in the source location that have already been loaded.
E. Replace the COPY INTO command with INSERT OVERWRITE to ensure data is loaded.