
Answer-first summary for fast verification
Answer: Option B. Incorporating comprehensive data quality checks and validation logic at each processing stage, including schema validation, null checks, and custom business rule validations.
Option B is the correct answer because it addresses the need for rigorous data quality checks and validation logic at every stage of the pipeline, which is critical for ensuring data consistency and quality in a regulated financial services environment. This approach also supports scalability, because issues are detected and corrected early in the pipeline rather than propagated downstream. While options A, C, and D offer some strategies for managing data, they either lack the comprehensive checks needed for regulatory compliance or fail to address the complexity and volume of data typical in financial services.
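As a rough illustration of what option B looks like in practice, the sketch below layers schema enforcement, null checks, and a custom business-rule filter onto a PySpark batch before it lands in a Delta Lake table. The table names, landing path, schema, and rule thresholds are hypothetical, not part of the original question.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("txn_quality_checks").getOrCreate()

# Hypothetical schema for incoming transaction records
expected_schema = StructType([
    StructField("txn_id", StringType(), nullable=False),
    StructField("account_id", StringType(), nullable=False),
    StructField("amount", DoubleType(), nullable=False),
    StructField("txn_ts", TimestampType(), nullable=False),
])

# Stage 1: schema validation -- enforce the expected schema at read time
# (in the default PERMISSIVE mode, values that cannot be cast surface as nulls)
raw = (
    spark.read
    .schema(expected_schema)
    .json("/landing/transactions/")   # hypothetical landing path
)

# Stage 2: null checks -- quarantine rows missing any mandatory field
mandatory = ["txn_id", "account_id", "amount", "txn_ts"]
null_cond = F.expr(" OR ".join(f"{c} IS NULL" for c in mandatory))
quarantined = raw.filter(null_cond)
clean = raw.filter(~null_cond)

# Stage 3: custom business-rule validation -- e.g. amounts must be positive
# and within a hypothetical reporting threshold
rule_ok = (F.col("amount") > 0) & (F.col("amount") <= 1_000_000)
valid = clean.filter(rule_ok)
rejected = clean.filter(~rule_ok)

# Write validated rows to the Delta table; keep rejects for audit and review
valid.write.format("delta").mode("append").saveAsTable("finance.transactions")
quarantined.union(rejected).write.format("delta").mode("append").saveAsTable("finance.transactions_rejected")
```

Keeping the quarantined and rejected rows in a separate Delta table, rather than dropping them, is one way to preserve the audit trail that a regulated environment typically requires.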
Author: LeetQuiz Editorial Team
In the context of designing a data pipeline for a financial services company that processes high-volume transactions from multiple sources into a Delta Lake table, which of the following approaches BEST ensures data quality and consistency while adhering to regulatory compliance and scalability requirements? (Choose one option)
A. Relying solely on a single source of truth and minimizing data duplication without implementing any additional data quality checks.
B. Incorporating comprehensive data quality checks and validation logic at each processing stage, including schema validation, null checks, and custom business rule validations.
C. Utilizing Delta Lake's ACID transactions and MERGE INTO statements for upserts to ensure data consistency, without explicit data quality checks.
D. Denormalizing all data and eliminating lookup tables to simplify the pipeline, assuming this will inherently ensure data quality and consistency.
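For contrast with option C, the minimal sketch below shows a Delta Lake upsert via MERGE INTO. It assumes the validated `valid` DataFrame and `spark` session from the earlier sketch, and the table and column names remain hypothetical. MERGE INTO gives atomic, consistent writes, but by itself it does nothing to catch bad values, which is why option B's validations are still needed upstream.

```python
# Stage validated rows for the upsert (assumes `valid` from the earlier sketch)
valid.createOrReplaceTempView("staged_transactions")

# Delta Lake MERGE: update existing transactions, insert new ones atomically
spark.sql("""
    MERGE INTO finance.transactions AS t
    USING staged_transactions AS s
    ON t.txn_id = s.txn_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```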