
You are a data engineer responsible for designing a data pipeline that processes customer transactions in real time for a financial services company. The pipeline must ensure high data integrity and comply with financial regulations. During the design phase, you are evaluating how to handle constraint violations, such as duplicate transactions or transactions that exceed account limits. The pipeline must continue to process valid transactions without significant delays, even when violations occur. Given these requirements, which of the following mechanisms would you implement to handle constraint violations? Choose the best option.
A
Implement 'ON VIOLATION DROP ROW' to silently drop violating rows without logging, ensuring the pipeline continues processing at maximum speed.
B
Implement 'ON VIOLATION FAIL UPDATE' to halt the pipeline immediately upon any violation, ensuring no invalid data is processed, regardless of the impact on processing speed.
C
Implement 'ON VIOLATION DROP ROW' with detailed logging of each violation, allowing the pipeline to continue processing while enabling later analysis of dropped rows.
D
Implement 'ON VIOLATION FAIL UPDATE' with a retry mechanism for violations, attempting to resolve them automatically before failing the pipeline.
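For reference, the two behaviors the options contrast can be sketched in plain Python. This is only an illustration of the semantics, not the pipeline framework's own API: `Transaction`, `find_violation`, `process`, and the `on_violation` parameter are hypothetical names invented for this sketch, and the duplicate check assumes transaction IDs are unique per valid transaction.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("pipeline.violations")


class ConstraintViolation(Exception):
    """Raised when the pipeline is configured to fail on a violation."""


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record shape, assumed for this sketch only.
    txn_id: str
    account_id: str
    amount: float
    account_limit: float


def find_violation(txn, seen_ids):
    """Return a reason string if txn violates a constraint, else None."""
    if txn.txn_id in seen_ids:
        return "duplicate transaction"
    if txn.amount > txn.account_limit:
        return "amount exceeds account limit"
    return None


def process(transactions, on_violation="drop"):
    """Apply the constraints; either drop-and-log or fail on a violation."""
    seen_ids = set()
    valid, dropped = [], []
    for txn in transactions:
        reason = find_violation(txn, seen_ids)
        if reason is None:
            seen_ids.add(txn.txn_id)
            valid.append(txn)
            continue
        if on_violation == "fail":
            # Analogous to failing the update: processing stops here.
            raise ConstraintViolation(f"{txn.txn_id}: {reason}")
        # Analogous to dropping the row, with a record kept for later analysis.
        logger.warning("dropped %s: %s", txn.txn_id, reason)
        dropped.append((txn, reason))
    return valid, dropped
```

With `on_violation="drop"`, valid transactions keep flowing and each dropped row is logged with its reason; with `on_violation="fail"`, the first violation raises and nothing after it is processed.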