
You are designing a data pipeline for a financial services company that requires real-time fraud detection. The solution must process streaming transaction data and write it to a Delta Lake table for analysis, ensure data consistency, support high throughput, and comply with regulatory requirements. Given these constraints, which of the following approaches BEST leverages Delta Lake's capabilities to meet the requirements? (Choose one option)
A
Utilize Delta Lake's batch processing features by aggregating streaming data into large batches before writing to the Delta Lake table, to minimize the number of transactions.
B
Since Delta Lake is not designed for streaming data, implement a custom solution outside of Delta Lake to process the streaming data before batch loading it into Delta Lake.
C
Directly write streaming data to Delta Lake using its transactional capabilities, such as the MERGE INTO statement, to ensure data consistency without the need for external systems.
D
Integrate a streaming platform like Apache Kafka with Delta Lake, using Kafka to ingest and process the streaming data in real-time, and Delta Lake to store and manage the data with ACID transactions.
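For context on the architecture described in option D, the sketch below shows one way to wire a Kafka transaction stream into a Delta table using Spark Structured Streaming, where each micro-batch commit is an ACID transaction on the table. It is a minimal, hedged example: it assumes a Spark cluster with the Delta Lake and Kafka connector packages installed, and the broker address, topic name, event schema, and storage paths are hypothetical placeholders.

from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

# Assumes the Delta Lake and Kafka connector packages are already on the cluster.
spark = SparkSession.builder.appName("fraud-ingest").getOrCreate()

# Hypothetical schema for a transaction event.
schema = StructType([
    StructField("transaction_id", StringType()),
    StructField("account_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the transaction stream from Kafka (broker and topic are placeholders).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "transactions")
       .load())

# Parse the Kafka value payload (assumed to be JSON) into typed columns.
transactions = (raw
                .select(from_json(col("value").cast("string"), schema).alias("t"))
                .select("t.*"))

# Append the stream to a Delta table; each micro-batch commit is transactional,
# and the checkpoint enables exactly-once delivery into the table.
query = (transactions.writeStream
         .format("delta")
         .outputMode("append")
         .option("checkpointLocation", "/mnt/checkpoints/transactions")
         .start("/mnt/delta/transactions"))

In this pattern Kafka handles ingestion and buffering of the raw events, while Delta Lake provides the ACID storage layer that downstream fraud-detection queries read from, which is the division of responsibilities option D describes.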