
Answer-first summary for fast verification
Answer: No, because Delta Lake supports time travel. The deleted data remains in the underlying Parquet files and can still be accessed via historical version queries until a `VACUUM` command is run.
Executing a `DELETE` statement in Delta Lake removes the specified rows from the **current** version of the table. However, it does not physically delete the underlying Parquet files from cloud storage immediately. This architecture is what enables Delta Lake's **Time Travel** feature, allowing users to query older versions of the table (e.g., `SELECT * FROM users VERSION AS OF <ID>`). To ensure the data is permanently and physically removed from storage (for example, to comply with GDPR/CCPR 'Right to be Forgotten' requests), a `VACUUM` command must be executed. `VACUUM` removes data files that are no longer referenced by the current snapshot of the table and are older than the retention period (default 168 hours or 7 days).
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
A data privacy team is auditing a process for deleting user records from a Delta Lake table named users. The process uses a secondary table, delete_requests_table, to identify the records to be purged.
Consider the following SQL command:
DELETE FROM users
WHERE user_id IN (SELECT user_id FROM delete_requests_table)
DELETE FROM users
WHERE user_id IN (SELECT user_id FROM delete_requests_table)
Assuming user_id is a unique identifier, does successfully executing this code guarantee that the data for these users is permanently inaccessible from the storage layer? Why or why not?
A
Yes, because Delta Lake’s ACID transactions ensure that once a deletion is committed, the data is immediately and completely purged from the underlying storage.
B
No, because the Delta cache may continue to serve the deleted data to users for a period of time before the cache is automatically refreshed.
C
No, because Delta Lake supports time travel. The deleted data remains in the underlying Parquet files and can still be accessed via historical version queries until a VACUUM command is run.
D
No, because the DELETE operation in Delta Lake is only transactionally safe if it is executed as part of a MERGE INTO statement.
E
No, because Delta Lake only marks the rows as deleted in the transaction log; the data remains accessible to anyone with read permissions on the storage container regardless of time travel settings.