
Answer-first summary for fast verification
Answer: Define the data quality rules in a centralized Databricks notebook or Python file and import them as a library within each DLT notebook.
Delta Live Tables supports portable and reusable expectations by defining quality rules in a shared Python (or SQL) module that is imported into each pipeline notebook.

**Key benefits of this approach:**

* **Centralization:** Manage all expectations in a single location.
* **Reduced boilerplate:** Reuse the same `@dlt.expect` or `CONSTRAINT` declarations without manual copying.
* **Integration:** Reusable modules integrate natively with DLT's built-in monitoring and lineage features.

**Why the other options are incorrect:**

* **Global variables (Option C):** Notebooks in a DLT pipeline have isolated scopes; global variables are not a stable or supported mechanism for code reuse across files.
* **Delta table storage (Option B):** Storing rules as data adds runtime complexity, requiring additional parsing logic, and moves away from standard software engineering practices for code reuse.
* **External jobs (Option D):** Modifying pipeline configurations via external jobs is operationally complex, error-prone, and bypasses the declarative nature of DLT.
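The centralized-rules pattern can be sketched as a plain Python module that maps rule names to SQL predicates. The module name (`shared_rules`), rule names, and column names below are hypothetical; only the `@dlt.expect_all(...)` usage shown in the comment reflects the actual DLT decorator API.

```python
# shared_rules.py — hypothetical centralized expectations module (sketch).
# Rules are plain {name: SQL predicate} pairs, the format accepted by
# dlt.expect_all / dlt.expect_all_or_drop / dlt.expect_all_or_fail.
QUALITY_RULES = {
    "valid_id": "id IS NOT NULL",
    "valid_timestamp": "event_ts IS NOT NULL",
    "positive_amount": "amount > 0",
}


def rules_for(*names):
    """Return the named subset of the shared rules as a dict."""
    return {name: QUALITY_RULES[name] for name in names}


# Usage inside a DLT pipeline notebook (requires the Databricks runtime):
#
#   import dlt
#   from shared_rules import rules_for
#
#   @dlt.table
#   @dlt.expect_all(rules_for("valid_id", "positive_amount"))
#   def orders():
#       return spark.read.table("raw.orders")
```

Because every table pulls its expectations from the same dictionary, updating a rule in one place propagates to all tables that reference it on the next pipeline update.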
Author: LeetQuiz Editorial Team
A data engineering team is developing a Delta Live Tables (DLT) pipeline containing several tables that require identical data quality checks. To improve maintainability and reduce redundancy, they want to reuse these data quality rules across all tables. What is the recommended approach for implementing reusable expectations in DLT?
A
Define the data quality rules in a centralized Databricks notebook or Python file and import them as a library within each DLT notebook.
B
Persist the data quality rules in a Delta table outside the pipeline's target schema and retrieve them by passing the schema name as a pipeline parameter.
C
Define global Python variables within one DLT notebook and rely on the execution context to share them across all other notebooks in the pipeline.
D
Implement an external job that programmatically modifies the pipeline's JSON configuration files to inject data quality constraints into the table definitions.