
Explanation:
In Delta Live Tables (DLT), the ON VIOLATION clause determines how constraint violations are handled. The specific syntax used in the question is:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION FAIL UPDATE
The ON VIOLATION FAIL UPDATE action means that when any record violates the constraint, the update fails immediately. For a table update, the transaction is rolled back atomically, so no records from that update are written to the target dataset, and the pipeline code or source data has to be fixed before the update can succeed. This is the strictest enforcement mechanism DLT offers.
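Comparison with other options:
A describes ON VIOLATION DROP ROW: violating records are dropped from the target dataset and counted in the data quality metrics in the event log.
C does not match any expectation action: expectations do not load violating records into a quarantine table on their own.
D describes the default behavior when no ON VIOLATION clause is given: violating records are still written to the target dataset and reported as invalid in the event log.
DLT Constraint Types:
No ON VIOLATION clause (default): violating rows are kept and reported as invalid.
ON VIOLATION DROP ROW: violating rows are dropped and counted.
ON VIOLATION FAIL UPDATE: the update fails on the first violation.
Failing the update enforces validation before any data enters the target dataset, which suits production pipelines where invalid records must never be written.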
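The difference between failing the update, dropping rows, and the default warn behavior can be modeled in a few lines of plain Python. This is only a toy sketch of the semantics, not the DLT API: the function `apply_expectation`, the action names `"warn"`, `"drop"`, and `"fail"`, and the sample records are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


class ExpectationFailed(Exception):
    """Models a failed update (the ON VIOLATION FAIL UPDATE case)."""


@dataclass
class UpdateResult:
    written: list = field(default_factory=list)  # records that reach the target
    violations: int = 0                          # count reported as invalid


def is_valid(record):
    # Mirrors the constraint in the question: timestamp > '2020-01-01'
    return record["timestamp"] > datetime(2020, 1, 1)


def apply_expectation(batch, action="warn"):
    """Toy model of one update; `action` names are hypothetical."""
    if action not in ("warn", "drop", "fail"):
        raise ValueError(f"unknown action: {action}")
    invalid = [r for r in batch if not is_valid(r)]
    if action == "fail" and invalid:
        # Nothing is written: the whole update is abandoned.
        raise ExpectationFailed(f"{len(invalid)} record(s) violated valid_timestamp")
    if action == "drop":
        written = [r for r in batch if is_valid(r)]
    else:
        # "warn" keeps everything; "fail" only reaches here with no violations.
        written = list(batch)
    return UpdateResult(written=written, violations=len(invalid))


# Hypothetical sample batch: one valid record, one violating record.
batch = [
    {"id": 1, "timestamp": datetime(2021, 6, 1)},
    {"id": 2, "timestamp": datetime(2019, 6, 1)},
]
```

With this batch, `"warn"` writes both records and reports one violation, `"drop"` writes only record 1, and `"fail"` raises `ExpectationFailed` without writing anything, which corresponds to answer B.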
A dataset has been defined using Delta Live Tables and includes an expectations clause:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION FAIL UPDATE
What is the expected behavior when a batch containing records that violate this constraint is processed?
A. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
B. Records that violate the expectation cause the job to fail.
C. Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.
D. Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.