
Answer-first summary for fast verification
Answer: They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.
## Explanation

**Correct Answer: D** - They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.

**Why Option D is correct:**

1. In Delta Live Tables (DLT), each table in the pipeline has its own data quality metrics and statistics.
2. By navigating to the DLT pipeline page in the Databricks workspace, the data engineer can click on individual tables to view detailed information about data quality expectations and the number of records that passed or failed each expectation.
3. This approach allows the engineer to identify exactly which table is dropping records by examining the data quality statistics for each table in the pipeline.

**Why the other options are incorrect:**

**Option A:** Setting up separate expectations for each table is a development-time configuration, not a monitoring approach. While expectations define what constitutes valid or invalid data, on their own they don't reveal which table is currently dropping records during pipeline execution.

**Option B:** The "Error" button typically shows pipeline execution errors or failures, not data quality drops. Records dropped due to quality concerns are not errors in the traditional sense; they are expected behavior when expectations are violated.

**Option C:** While DLT can be configured to send notifications, email notifications don't provide the granular detail needed to identify which specific table is dropping records. Notifications typically alert on pipeline failures or completions, not detailed per-table data quality statistics.

**Key Concept:** Delta Live Tables provides built-in data quality monitoring through the Databricks UI, where users can view expectation metrics for each table, including counts of records that passed and failed expectations, helping identify where data is being filtered out in the pipeline.
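To make the idea concrete, here is a minimal, self-contained Python sketch of the concept behind per-table expectation metrics. It does **not** use the real Databricks `dlt` library (which only runs inside a DLT pipeline); the table names, records, and predicates are hypothetical. It mimics the "drop invalid records" behavior and the per-table passed/failed counts that the DLT pipeline page surfaces:

```python
# Hypothetical sketch: mimics DLT-style expect-or-drop behavior and the
# per-table passed/failed counts shown on the DLT pipeline page.
# Table names, records, and predicates are illustrative only.

def apply_expectation(records, predicate):
    """Drop records failing the expectation; return (kept records, stats)."""
    kept = [r for r in records if predicate(r)]
    stats = {"passed": len(kept), "failed": len(records) - len(kept)}
    return kept, stats

# A three-table pipeline (bronze -> silver -> gold), each with one expectation.
raw = [
    {"id": 1, "amount": 10},
    {"id": None, "amount": 5},   # fails the bronze expectation
    {"id": 3, "amount": -2},     # fails the silver expectation
]

quality = {}
bronze, quality["bronze"] = apply_expectation(raw, lambda r: r["id"] is not None)
silver, quality["silver"] = apply_expectation(bronze, lambda r: r["amount"] > 0)
gold, quality["gold"] = apply_expectation(silver, lambda r: True)  # no-op check

# Inspecting the per-table stats pinpoints where records were dropped,
# just as clicking each table in the DLT UI does.
for table, stats in quality.items():
    print(table, stats)
# bronze {'passed': 2, 'failed': 1}
# silver {'passed': 1, 'failed': 1}
# gold {'passed': 1, 'failed': 0}
```

In a real pipeline the same role is played by expectation decorators on each table and the metrics DLT records for them; the point is that the statistics are kept per table, so comparing them across tables reveals the drop site.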
Author: Keng Suppaseth
A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped.
Which approach can the data engineer take to identify the table that is dropping the records?
**A.** They can set up separate expectations for each table when developing their DLT pipeline.

**B.** They can navigate to the DLT pipeline page, click on the "Error" button, and review the present errors.

**C.** They can set up DLT to notify them via email when records are dropped.

**D.** They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.