
Answer-first summary for fast verification
Answer: Define the data quality rules in a separate Databricks notebook or Python file and import them as a module/library into each DLT notebook in the pipeline.
### Explanation

**Option B is correct.** Delta Live Tables supports code modularity by allowing you to define quality rules (expectations) in a shared Python module or a separate Databricks notebook. By importing these definitions as a library into your DLT notebooks, you can:

* **Centralize logic:** Manage all validation logic in a single location.
* **Improve maintainability:** Updates to a rule in the library automatically propagate to every table that references it.
* **Reduce boilerplate:** Eliminate the need to copy-paste `@dlt.expect` or `CONSTRAINT` definitions across multiple files.

Databricks documentation recommends using standard Python import patterns to load these shared expectations, keeping data quality logic cleanly separated from the pipeline's transformation logic.

### Why the other options are incorrect

* **A:** While you can store rules in a Delta table, this requires custom logic to read the table at runtime and apply the rules to the DLT decorators, adding unnecessary complexity compared to native code imports.
* **C:** Databricks notebooks run in isolated scopes; global Python variables do not reliably persist or share state across different notebooks within a DLT pipeline execution.
* **D:** Using an external job to modify the pipeline's underlying configuration files is operationally complex and decouples rule enforcement from the DLT engine's native monitoring and lineage tracking.
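As a minimal sketch of this pattern: the rules live in a shared Python file and each DLT notebook imports only the subset it needs. The module name `shared_expectations`, the `RULES` dictionary, and the `rules_for` helper below are hypothetical names chosen for illustration; `@dlt.expect_all` is the DLT decorator that accepts a dict mapping expectation names to SQL conditions.

```python
# shared_expectations.py (hypothetical shared module checked into the repo)

# Expectation name -> SQL condition string, the shape @dlt.expect_all accepts.
RULES = {
    "valid_id": "id IS NOT NULL",
    "valid_timestamp": "event_ts IS NOT NULL",
    "positive_amount": "amount > 0",
}

def rules_for(*names):
    """Return the subset of the shared rules that a given table needs."""
    return {name: RULES[name] for name in names}

# In a DLT notebook, the shared rules would then be imported and applied, e.g.:
#
#   import dlt
#   from shared_expectations import rules_for
#
#   @dlt.expect_all(rules_for("valid_id", "positive_amount"))
#   @dlt.table
#   def orders():
#       ...
```

Because the rules are ordinary Python data, updating a condition in `RULES` propagates to every table on the next pipeline update, with no copy-pasted decorators to keep in sync.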
Author: LeetQuiz Editorial Team
A data engineering team is developing a Delta Live Tables (DLT) pipeline comprising multiple tables that require identical data quality checks. To maximize maintainability and reduce redundancy, they need to share these expectations across the entire pipeline.
What is the most efficient and recommended approach to achieve this reuse?
A
Maintain the data quality rules in a Delta table outside the pipeline's target schema and pass the schema name as a pipeline parameter to be queried at runtime.
B
Define the data quality rules in a separate Databricks notebook or Python file and import them as a module/library into each DLT notebook in the pipeline.
C
Utilize global Python variables to share expectation definitions across different DLT notebooks assigned to the same pipeline.
D
Apply data quality constraints to the pipeline's tables using an external job that modifies the pipeline’s underlying configuration files.