
A dataset has been defined using Delta Live Tables and includes an expectations clause:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION FAIL UPDATE
What is the expected behavior when a batch containing records that violate this constraint is processed?
A
Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
B
Records that violate the expectation cause the job to fail.
C
Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.
D
Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
Explanation:
In Delta Live Tables (DLT), the ON VIOLATION clause determines how constraint violations are handled. The specific syntax used in the question is:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION FAIL UPDATE
The ON VIOLATION FAIL UPDATE action means that when any record violates the constraint, the pipeline update fails immediately and the transaction is rolled back, so no data from that update is written to the target dataset. This is the strictest enforcement mechanism: it protects data quality by stopping the update rather than letting invalid data through. The correct answer is therefore B.
Comparison with other options:
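A describes ON VIOLATION DROP ROW: violating records are dropped from the target dataset and the violations are recorded in the event log.
C is not a built-in expectation action; quarantining invalid records requires additional pipeline logic.
D describes the default behavior when no ON VIOLATION clause is given: violating records are written to the target dataset and the violations are recorded in the event log.
DLT expectation actions:
No ON VIOLATION clause (default): Violating rows are kept and the violations are recorded.
ON VIOLATION DROP ROW: Violating rows are dropped and the violations are recorded.
ON VIOLATION FAIL UPDATE: The update fails on violation.
The FAIL UPDATE behavior enforces strict validation before data enters the target dataset, which matters for production pipelines where invalid data must never reach downstream consumers.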
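The difference between the expectation actions can be illustrated with a small plain-Python simulation. This is only a sketch of the semantics, not the DLT API: `apply_expectation`, `ExpectationFailed`, the policy names (`"warn"`, `"drop_row"`, `"fail_update"`) and the sample batch are all hypothetical names chosen for illustration.

```python
from datetime import datetime


class ExpectationFailed(Exception):
    """Hypothetical error standing in for a failed pipeline update."""


def apply_expectation(records, predicate, on_violation="warn"):
    """Simulate how an expectation treats one batch of records.

    Returns a tuple (rows_written, violation_count).
    This mimics the documented behavior only; it does not call DLT.
    """
    violations = [r for r in records if not predicate(r)]

    if on_violation == "fail_update":
        # Any violation aborts the whole update: nothing is written.
        if violations:
            raise ExpectationFailed(
                f"{len(violations)} record(s) violated the expectation"
            )
        return list(records), 0

    if on_violation == "drop_row":
        # Invalid rows are filtered out, but still counted.
        valid = [r for r in records if predicate(r)]
        return valid, len(violations)

    if on_violation == "warn":
        # Default: invalid rows are kept, violations are only counted.
        return list(records), len(violations)

    raise ValueError(f"unknown policy: {on_violation}")


CUTOFF = datetime(2020, 1, 1)


def valid_timestamp(record):
    """Equivalent of EXPECT (timestamp > '2020-01-01')."""
    return record["timestamp"] > CUTOFF


# Hypothetical batch: one valid record and one violating record.
batch = [
    {"id": 1, "timestamp": datetime(2021, 6, 1)},
    {"id": 2, "timestamp": datetime(2019, 3, 15)},
]
```

Calling `apply_expectation(batch, valid_timestamp, "fail_update")` raises `ExpectationFailed`, which corresponds to answer B. With `"drop_row"` only the record with `id` 1 is written, and with `"warn"` both records are written while one violation is counted.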