
When designing a data pipeline for a financial services company that ingests high-volume transactions from multiple sources into a Delta Lake table, which of the following approaches BEST ensures data quality and consistency while meeting regulatory compliance and scalability requirements? (Choose one option)
A. Relying solely on a single source of truth and minimizing data duplication without implementing any additional data quality checks.
B. Incorporating comprehensive data quality checks and validation logic at each processing stage, including schema validation, null checks, and custom business rule validations.
C. Utilizing Delta Lake's ACID transactions and MERGE INTO statements for upserts to ensure data consistency, without explicit data quality checks.
D. Denormalizing all data and eliminating lookup tables to simplify the pipeline, assuming this will inherently ensure data quality and consistency.
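
To make the validations described in option B concrete, here is a minimal PySpark sketch. The table paths, column names, and allowed-currency list are hypothetical placeholders, not part of the question; they simply stand in for whatever the real transaction feed defines.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (
    StructType, StructField, StringType, DecimalType, TimestampType,
)

# Assumes the delta-spark package is available (e.g. pip install delta-spark).
spark = (
    SparkSession.builder.appName("txn-quality-checks")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Schema validation: enforce an explicit schema on the raw feed.
# (Columns are hypothetical stand-ins for the real transaction layout;
# Spark does not enforce nullability on read, hence the null checks below.)
expected_schema = StructType([
    StructField("txn_id", StringType(), nullable=False),
    StructField("account_id", StringType(), nullable=False),
    StructField("amount", DecimalType(18, 2), nullable=False),
    StructField("currency", StringType(), nullable=False),
    StructField("txn_ts", TimestampType(), nullable=False),
])

raw = (
    spark.read.format("json")
    .schema(expected_schema)
    .load("/mnt/raw/transactions/")  # hypothetical landing path
)

# Null checks: drop records missing any mandatory field.
required = ["txn_id", "account_id", "amount", "currency", "txn_ts"]
non_null = raw.dropna(subset=required)

# Custom business rules: e.g. positive amounts and an allowed currency list.
valid = non_null.filter(
    (F.col("amount") > 0) & F.col("currency").isin("USD", "EUR", "GBP")
)

# Quarantine rejected rows instead of silently dropping them.
rejected = raw.subtract(valid)
rejected.write.format("delta").mode("append").save("/mnt/quarantine/transactions/")

# Only validated records reach the curated Delta table.
valid.write.format("delta").mode("append").save("/mnt/curated/transactions/")
```

Writing rejected rows to a quarantine table rather than discarding them keeps an audit trail of what was filtered out at each stage, which is the kind of traceability the question's regulatory-compliance framing calls for.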
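For comparison, option C's mechanism on its own is sketched below using the delta-spark Python API; the merge key `txn_id` and the staging/target paths are assumptions for illustration. MERGE INTO gives atomic upserts backed by Delta Lake's ACID transactions, but, as the option itself states, it does not by itself validate what gets merged.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

# Assumes the delta-spark package is installed and configured as above.
spark = (
    SparkSession.builder.appName("txn-upserts")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical staging batch of incoming transactions.
updates = spark.read.format("delta").load("/mnt/staging/transactions/")

target = DeltaTable.forPath(spark, "/mnt/curated/transactions/")

# MERGE INTO: atomically update existing transactions and insert new ones,
# relying on Delta Lake's ACID transaction guarantees for consistency.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.txn_id = s.txn_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```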