
A data engineering team is developing a Delta Live Tables (DLT) pipeline comprising multiple tables that require identical data quality checks. To maximize maintainability and reduce redundancy, they need to share these expectations across the entire pipeline.
What is the most efficient and recommended approach to achieve this reuse?
A. Maintain the data quality rules in a Delta table outside the pipeline's target schema and pass the schema name as a pipeline parameter to be queried at runtime.
B. Define the data quality rules in a separate Databricks notebook or Python file and import them as a module/library into each DLT notebook in the pipeline.
C. Utilize global Python variables to share expectation definitions across different DLT notebooks assigned to the same pipeline.
D. Apply data quality constraints to the pipeline's tables using an external job that modifies the pipeline's underlying configuration files.
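For context, the sketch below illustrates what reusing expectations through a shared Python module (the pattern described in option B) could look like. The file name shared_expectations.py, the RULES dictionary, and the table and column names are assumptions chosen for illustration; they are not part of the question.

```python
# shared_expectations.py -- hypothetical shared module holding the rules in one place.
# Each entry maps an expectation name to a SQL boolean constraint.
RULES = {
    "valid_id": "id IS NOT NULL",
    "valid_timestamp": "event_ts IS NOT NULL",
    "positive_amount": "amount > 0",
}


# pipeline_notebook.py -- one of the DLT notebooks in the pipeline (illustrative).
import dlt
from shared_expectations import RULES  # assumes the module is importable from the pipeline's source path

# `spark` is provided by the DLT runtime inside a pipeline notebook.

@dlt.table(comment="Bronze orders with the shared quality checks applied")
@dlt.expect_all_or_drop(RULES)  # drop any row that violates one of the shared rules
def orders_bronze():
    return spark.read.table("raw.orders")

@dlt.table(comment="Bronze customers reusing the same quality checks")
@dlt.expect_all_or_drop(RULES)
def customers_bronze():
    return spark.read.table("raw.customers")
```

Because every table decorates itself with the same RULES dictionary, a rule change is made once in the shared module rather than in each notebook.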