
Answer-first summary for fast verification
Answer: Use a batch processing approach with Delta Lake to handle incremental processing and deduplication.
Option A is correct because Delta Lake supports batch workloads directly: MERGE INTO handles deduplication by upserting only new or changed records, and features such as auto-compaction keep the table efficient as incremental batches are appended. Option B prescribes a streaming architecture for what is stated to be a batch requirement; Option C ignores the deduplication requirement entirely; Option D gives up Delta Lake's ACID transactions and schema enforcement for no benefit.
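The MERGE INTO pattern named above can be sketched in Delta Lake SQL. This is a minimal illustration, assuming a Delta table `transactions` keyed by `txn_id` and an incoming batch staged as a view or table `new_batch`; these names are hypothetical.

```sql
-- Upsert the incoming batch into the Delta table, deduplicating on txn_id:
-- rows already present are updated in place, unseen rows are inserted.
MERGE INTO transactions AS t
USING new_batch AS s
  ON t.txn_id = s.txn_id
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  INSERT *
```

Because MERGE runs as a single ACID transaction, re-running the same batch is idempotent: already-merged records are simply updated to identical values rather than duplicated.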
Author: LeetQuiz Editorial Team
You are tasked with implementing a data pipeline that processes batch data from a financial institution. The data needs to be processed incrementally and deduplicated. Describe the architecture and operations necessary to achieve this using Delta Lake with batch workloads.
A
Use a batch processing approach with Delta Lake to handle incremental processing and deduplication.
B
Implement a streaming architecture with Delta Lake using MERGE INTO for deduplication and auto-compaction for incremental processing.
C
Use a combination of Kafka and Delta Lake for batch data without handling deduplication.
D
Implement a custom Python script to handle batch data and deduplication without using Delta Lake.
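The auto-compaction behavior referenced in the explanation can be enabled through table properties, so small files produced by frequent incremental batch appends are coalesced automatically. A minimal sketch, assuming a Databricks-style Delta table named `transactions` (hypothetical):

```sql
-- Enable auto-compaction and optimized writes on the target Delta table
-- so incremental batch appends do not accumulate many small files.
ALTER TABLE transactions SET TBLPROPERTIES (
  'delta.autoOptimize.autoCompact'   = 'true',
  'delta.autoOptimize.optimizeWrite' = 'true'
);
```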