
To ensure the highest data quality in your ETL pipelines deployed on Azure Databricks, you decide to implement an advanced testing framework that uses machine learning to predict and identify potential data quality issues. How would you design and integrate this framework into your deployment process?
A. Incorporating, within Databricks notebooks, an unsupervised learning model that continuously learns from incoming data and flags anomalies before they reach downstream systems
B. Using Databricks MLflow for model management, automating the deployment of data quality models into production pipelines, and monitoring model performance
C. Leveraging Azure Machine Learning to periodically retrain data quality models, then deploying those models as web services called by Databricks jobs for real-time quality checks
D. Training a machine learning model on historical data quality issues and integrating it into the CI/CD pipeline to evaluate new data batches before deployment
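For context, option A's continuous anomaly-flagging approach could look roughly like the sketch below. This is a minimal illustration under stated assumptions, not a reference implementation: the feature columns, thresholds, and usage lines are hypothetical, and it assumes pandas and scikit-learn are available on the cluster.

```python
# Minimal sketch of option A: unsupervised anomaly detection on an
# incoming batch before it is written downstream. Feature columns and
# the contamination rate below are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import IsolationForest

def flag_anomalies(batch: pd.DataFrame, feature_cols: list[str],
                   contamination: float = 0.01) -> pd.DataFrame:
    """Return the batch with an is_anomaly column marking suspect rows."""
    model = IsolationForest(contamination=contamination, random_state=42)
    # fit_predict returns 1 for inliers and -1 for outliers
    preds = model.fit_predict(batch[feature_cols])
    return batch.assign(is_anomaly=(preds == -1))

# Hypothetical usage inside a Databricks notebook cell:
# scored = flag_anomalies(incoming_df.toPandas(), ["amount", "latency_ms"])
# clean = scored[~scored["is_anomaly"]]        # pass downstream
# quarantined = scored[scored["is_anomaly"]]   # route for review
```

IsolationForest is only one possible detector here; the same gating pattern applies whatever model is used, and the contamination rate would need tuning against the pipeline's actual false-positive tolerance.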