
You are designing a data pipeline on Azure Databricks that uses constraints to enforce data integrity. The pipeline processes financial transactions for a banking application, where regulatory compliance requirements make data accuracy and integrity paramount. During a run, a batch of data fails to meet the defined constraints. Given the need to maintain high data quality and adhere to strict compliance standards, what is the most appropriate default behavior of the pipeline when a constraint violation is encountered? Choose the best option.
A
The pipeline will continue processing the batch, logging the constraint violation for later review, but not stopping the overall process.
B
The pipeline will halt processing immediately upon detecting a constraint violation, raising an error to alert the data engineering team.
C
The pipeline will attempt to automatically correct the data that violates the constraints and proceed with processing the corrected batch.
D
The pipeline will skip the problematic records, processing only those that meet all constraints, and generate a report of skipped records.
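
To make the four options concrete, the following is a minimal plain-Python sketch of the violation-handling strategies they describe. It is not Databricks or Delta Lake code: `process_batch`, `BatchResult`, `ConstraintViolation`, the `on_violation` mode names, and the `amount > 0` rule are all hypothetical names chosen for illustration, and the "fix" branch uses a deliberately naive correction.

```python
from dataclasses import dataclass, field


class ConstraintViolation(Exception):
    """Raised when a record breaks the constraint in 'fail' mode (hypothetical)."""


@dataclass
class BatchResult:
    written: list = field(default_factory=list)     # records that reach the target
    violations: list = field(default_factory=list)  # records that broke the constraint


def amount_is_positive(record):
    # Illustrative constraint: a transaction amount must be strictly positive.
    return record["amount"] > 0


def process_batch(records, constraint, on_violation):
    """Apply one of four hypothetical strategies, mirroring options A to D.

    'warn' -> option A: keep the record and log the violation
    'fail' -> option B: raise immediately, so nothing is returned or written
    'fix'  -> option C: write a naively corrected copy of the record
    'drop' -> option D: skip the record and report it
    """
    if on_violation not in {"warn", "fail", "fix", "drop"}:
        raise ValueError(f"unknown mode: {on_violation}")

    result = BatchResult()
    for record in records:
        if constraint(record):
            result.written.append(record)
            continue

        if on_violation == "fail":
            # Abort the whole batch; the caller never sees a partial result.
            raise ConstraintViolation(f"constraint failed for record {record!r}")

        result.violations.append(record)
        if on_violation == "warn":
            result.written.append(record)
        elif on_violation == "fix":
            # Placeholder correction only; real remediation rules are domain-specific.
            result.written.append({**record, "amount": abs(record["amount"])})
        # 'drop': the record is reported in violations but not written.
    return result


batch = [{"id": 1, "amount": 100}, {"id": 2, "amount": -25}]
dropped = process_batch(batch, amount_is_positive, on_violation="drop")
```

In this sketch only the "fail" mode guarantees that no invalid or silently altered record reaches the target, at the cost of stopping the batch; the other three modes trade that guarantee for availability.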