
Answer-first summary for fast verification
Answer: Combine chunks → Convert to DataFrame → Write to Delta Lake in Overwrite mode
The correct sequence is **B** because:

1. **Combine chunks** - first, combine the chunked text data into a unified dataset.
2. **Convert to DataFrame** - then convert the combined data into a DataFrame that Spark can process.
3. **Write to Delta Lake in Overwrite mode** - finally, write the DataFrame to Delta Lake using Overwrite mode, which is efficient for initial data loading and replaces any existing data.

This approach is efficient because:

- Combining chunks first avoids creating multiple small files.
- Converting to a DataFrame enables Spark's distributed processing.
- Overwrite mode is optimal for initial data loading scenarios where you want to replace existing data.

The other options are incorrect:

- **A** includes an unnecessary schema-definition step and uses Merge mode, which is for upsert operations.
- **C** converts to a DataFrame before combining, which is inefficient for chunked data.
- **D** creates the schema separately and uses Append mode, which is for adding to existing data.
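The sequence in option B can be sketched as follows. This is a minimal illustration, assuming the chunked text arrives as an in-memory list of strings and one combined document per row; the Spark and Delta Lake calls are shown as comments because they require a running Spark session with Delta Lake and Unity Catalog configured, and the table name used there is a hypothetical placeholder.

```python
# Hypothetical chunked input: a list of text fragments.
chunks = ["chunk one of the text, ", "chunk two, ", "and the final chunk."]

# Step 1: combine chunks into a unified dataset (one record here).
combined_text = "".join(chunks)
rows = [{"doc_id": 1, "text": combined_text}]

# Step 2: convert to a DataFrame. On Spark this would be:
#   df = spark.createDataFrame(rows)

# Step 3: write to Delta Lake in Overwrite mode, replacing any
# existing data (table name is a placeholder):
#   df.write.format("delta").mode("overwrite") \
#       .saveAsTable("my_catalog.my_schema.docs")

print(rows[0]["text"])
```

Note that Overwrite mode replaces the table contents wholesale, which matches an initial bulk load; Append or Merge would only be appropriate for incremental additions or upserts.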
Author: LeetQuiz
Question: 5
You are tasked with writing a large, chunked text dataset into Delta Lake tables within Unity Catalog. The data needs to be prepared efficiently for querying and analysis. Which of the following is the correct sequence of operations to write the chunked text data into a Delta Lake table?
A
Combine chunks → Convert to DataFrame → Define Delta Table schema → Write to Delta Lake in Merge mode
B
Combine chunks → Convert to DataFrame → Write to Delta Lake in Overwrite mode
C
Convert to DataFrame → Combine chunks → Write to Delta Lake in Append mode
D
Combine chunks → Create Delta Table schema → Write to Delta Lake in Append mode