Databricks Certified Data Engineer - Associate

A data engineer has implemented an ETL pipeline using Delta Live Tables to manage travel reimbursement details. They need to ensure the pipeline terminates when location details are missing from employee submissions.

What is the difference between using ON VIOLATION DROP ROW and ON VIOLATION FAIL UPDATE for handling constraint violations in this scenario, and which approach should be used to meet the requirement?




Explanation:

The requirement is to terminate the pipeline when location details are missing, that is, when location is NULL. In Delta Live Tables, the ON VIOLATION clause of a constraint defines what happens to records that violate it.

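  • Option B uses `ON VIOLATION FAIL UPDATE`, the documented Delta Live Tables clause for stopping the update as soon as a record violates the constraint, so it meets the requirement.
  • Option D uses `ON VIOLATION FAIL`, which is incomplete: the documented clause is `ON VIOLATION FAIL UPDATE`.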
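  • Option A checks for `location = NULL`, which is the wrong condition (a NULL check needs `IS NOT NULL`, because `=` against NULL never evaluates to true), and uses invalid syntax.
  • Option C uses `ON DROP ROW`, which is invalid syntax (the clause is `ON VIOLATION DROP ROW`); even written correctly, it would drop the violating rows and let the pipeline continue instead of failing it.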
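
The difference between dropping violating rows and failing the update can be modeled in a few lines of plain Python. This is only a sketch of the semantics, not the Delta Live Tables API: `apply_expectation`, `ExpectationFailed`, `has_location`, and the sample submissions are hypothetical names made up for illustration.

```python
from typing import Callable, Iterable

Record = dict


class ExpectationFailed(Exception):
    """Raised when a record violates the constraint under the 'fail' action."""


def apply_expectation(
    records: Iterable[Record],
    predicate: Callable[[Record], bool],
    on_violation: str,
) -> list[Record]:
    """Return the records that would be written under the given action.

    on_violation="drop": skip violating records and keep processing.
    on_violation="fail": raise on the first violating record, so nothing is returned.
    """
    if on_violation not in ("drop", "fail"):
        raise ValueError(f"unknown action: {on_violation!r}")

    kept = []
    for record in records:
        if predicate(record):
            kept.append(record)
        elif on_violation == "fail":
            raise ExpectationFailed(f"constraint violated by {record!r}")
        # on_violation == "drop": silently skip the invalid record
    return kept


def has_location(record: Record) -> bool:
    # Mirrors the intent of `location IS NOT NULL`.
    return record.get("location") is not None


# Hypothetical reimbursement submissions; the second one lacks a location.
submissions = [
    {"employee_id": 1, "location": "Berlin", "amount": 120.0},
    {"employee_id": 2, "location": None, "amount": 80.0},
    {"employee_id": 3, "location": "Austin", "amount": 45.5},
]

dropped_result = apply_expectation(submissions, has_location, "drop")
print([r["employee_id"] for r in dropped_result])

try:
    apply_expectation(submissions, has_location, "fail")
except ExpectationFailed as exc:
    print("update failed:", exc)
```

Under the "drop" action the run finishes with employees 1 and 3 and quietly loses the submission without a location. Under the "fail" action the same input raises and produces no output, which is the behavior the scenario asks for.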