
Answer-first summary for fast verification
Answer: Use 'INSERT OVERWRITE' to overwrite the existing customer data with the new information, as it preserves the table schema and efficiently updates the data without additional storage costs.
Option B is correct because 'INSERT OVERWRITE' replaces the data in an existing table while leaving the table definition untouched, which matches the requirement to refresh the customer database daily without altering its schema. It also keeps operational costs down, since no second table has to be created and maintained.
Option A is incorrect because 'CREATE OR REPLACE TABLE' replaces the table definition along with the data, so the schema can change, which the scenario rules out.
Option C is incorrect because it adds an unnecessary step: creating a backup table increases both storage costs and complexity.
Option D is incorrect because 'MERGE INTO', although it offers finer control over how rows are updated, is more complex than this scenario needs. The requirement is simply to overwrite existing data with new data without schema changes.
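To make the comparison concrete, here is a minimal Databricks SQL sketch of the commands named in Options A, B, and D. The table names `customers` and `customers_staging` and the column list are hypothetical and do not come from the question. The sketch assumes both tables already exist and that the staging table holds the day's full customer snapshot.

```sql
-- Hypothetical names: `customers` is the existing target table,
-- `customers_staging` holds the day's full customer snapshot.

-- Option B: replace the rows, keep the table definition.
-- The query's columns must be compatible with the existing schema.
INSERT OVERWRITE TABLE customers
SELECT customer_id, name, email, updated_at
FROM customers_staging;

-- Option A, for contrast: the table definition is replaced as well,
-- so the schema becomes whatever the query returns.
CREATE OR REPLACE TABLE customers AS
SELECT customer_id, name, email, updated_at
FROM customers_staging;

-- Option D, for contrast: a row-level upsert keyed on customer_id.
MERGE INTO customers AS t
USING customers_staging AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

Only the first statement is needed for the scenario as stated. The other two are shown to illustrate why the explanation treats them as a schema risk (Option A) or as extra complexity (Option D).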
Author: LeetQuiz Editorial Team
In the context of incremental data processing within Azure Databricks, consider a scenario where a data engineer is tasked with updating a customer database. The database must be updated with new customer information daily, without altering the existing table schema. Additionally, the solution must ensure data integrity and minimize operational costs. Which of the following commands should the data engineer use to achieve these requirements, and why? Choose the best option from the following:
A
Use 'CREATE OR REPLACE TABLE' to create a new table with the updated customer information, as it allows for schema changes and ensures the latest data is always available.
B
Use 'INSERT OVERWRITE' to overwrite the existing customer data with the new information, as it preserves the table schema and efficiently updates the data without additional storage costs.
C
Use a combination of 'CREATE OR REPLACE TABLE' and 'INSERT OVERWRITE' to first create a backup of the existing table and then update it with new data, ensuring data safety and integrity.
D
Neither 'CREATE OR REPLACE TABLE' nor 'INSERT OVERWRITE' is suitable for this scenario; instead, use 'MERGE INTO' to update the customer database, as it provides more control over the data update process.