
Answer-first summary for fast verification
Answer: Use the ALTER TABLE sales_data ADD COLUMNS SQL command to add the new columns, then ingest the new data files.
Reference Explanation:
Zero Downtime: ALTER TABLE ... ADD COLUMNS is an online, metadata-only operation in Delta Lake. It does not rewrite existing data files, so the table remains available for reads and writes, and existing rows return NULL for the new columns.
Data Consistency: Adding the columns explicitly ensures the schema is updated before new data is ingested, preventing schema mismatch errors.
Auditability: The Delta Lake transaction log records all schema changes, so you can track when and how the schema was altered.
Best Practices: Explicit schema management is recommended in production to avoid accidental schema drift and maintain control over table structure.
Why other options are less suitable:
B: While mergeSchema can automatically evolve the schema, it changes the schema implicitly as part of a write, which is harder to audit and can introduce accidental schema changes if not carefully managed. It is therefore not recommended for production.
C: Dropping and recreating the table causes downtime, risks data loss, and is not necessary for adding columns.
D: Maintaining two tables and unioning them complicates data management, increases maintenance overhead, and does not scale for production.
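To make option A concrete, here is a minimal sketch of the statement it would run. The helper name build_add_columns_sql is hypothetical and used only for illustration. The column types STRING and BOOLEAN are taken from the question. The commented Spark calls assume an active SparkSession named spark and a DataFrame named new_df, neither of which is created here.

```python
# Hypothetical helper (not part of Delta Lake): builds the ALTER TABLE
# statement described in option A.
def build_add_columns_sql(table, columns):
    """Return an ALTER TABLE ... ADD COLUMNS statement.

    `columns` maps column name -> Spark SQL type,
    e.g. {"discount_code": "STRING"}.
    """
    if not columns:
        raise ValueError("at least one column is required")
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return f"ALTER TABLE {table} ADD COLUMNS ({cols})"


# The two columns named in the question.
new_columns = {"discount_code": "STRING", "promotion_flag": "BOOLEAN"}
statement = build_add_columns_sql("sales_data", new_columns)
print(statement)

# Assumed usage with an existing SparkSession `spark` and DataFrame `new_df`:
#   spark.sql(statement)   # step 1: explicit schema change
#   new_df.write.format("delta").mode("append").saveAsTable("sales_data")  # step 2: ingest
```

Running the schema change as its own step, before the append, keeps it as a separate, reviewable entry in the table history rather than a side effect of a data write.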
Author: LeetQuiz Editorial Team
You are managing a production Delta Lake table, sales_data, which is used by multiple downstream analytics jobs. The table is partitioned by region and sale_date. Recently, your team needs to ingest new data files that include two additional columns: discount_code (string) and promotion_flag (boolean), which were not present in the original schema.
Requirements:
Which approach best satisfies all requirements? Select the best answer and explain why the other options are less suitable.
A
Use the ALTER TABLE sales_data ADD COLUMNS SQL command to add the new columns, then ingest the new data files.
B
Ingest the new data files using a DataFrame write operation with mergeSchema enabled, and rely on Delta Lake’s automatic schema evolution.
C
Drop and recreate the sales_data table with the new schema, then reload all historical and new data.
D
Create a new Delta table with the updated schema, ingest new data there, and union it with the old table for queries.