
Answer-first summary for fast verification
Answer: Automate the notebook execution using Azure Databricks jobs, implement robust exception handling with logging, and configure the Delta Lake for dynamic data retention based on data volume.
Automating execution with Azure Databricks jobs gives the pipeline reliable scheduling, retries, and scalability. Robust exception handling with logging surfaces failures early and preserves the integrity of the data pipeline, while dynamic data retention based on data volume keeps storage usage in check and supports compliance with data governance policies.
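The sketch below is a minimal illustration of the correct option, assuming a Python notebook running on Databricks. The table name, the amount column, the volume threshold, and the retention intervals are hypothetical placeholders, not part of the question; a real pipeline would also use a proper anomaly-detection model rather than a simple statistical filter.

```python
import logging

from pyspark.sql import SparkSession

# Hypothetical names and thresholds used only for illustration.
TRANSACTIONS_TABLE = "finance.daily_transactions"   # assumed Delta table
RETENTION_SMALL = "interval 30 days"                 # assumed retention tiers
RETENTION_LARGE = "interval 7 days"
VOLUME_THRESHOLD_ROWS = 100_000_000                  # assumed volume cutoff

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("daily_transactions_pipeline")

spark = SparkSession.builder.getOrCreate()


def process_daily_transactions() -> None:
    """Load today's transactions, flag anomalies, and adjust Delta retention."""
    try:
        df = spark.table(TRANSACTIONS_TABLE)

        # Placeholder anomaly check: flag rows whose amount deviates strongly
        # from the mean. A production notebook would use a real detection model.
        stats = df.selectExpr("avg(amount) AS mu", "stddev(amount) AS sigma").first()
        anomalies = df.filter(
            (df["amount"] > stats["mu"] + 3 * stats["sigma"])
            | (df["amount"] < stats["mu"] - 3 * stats["sigma"])
        )
        logger.info("Detected %d anomalous transactions", anomalies.count())

        # Dynamic retention: tighten the Delta file-retention window when the
        # table grows past the assumed volume threshold.
        row_count = df.count()
        retention = RETENTION_LARGE if row_count > VOLUME_THRESHOLD_ROWS else RETENTION_SMALL
        spark.sql(
            f"ALTER TABLE {TRANSACTIONS_TABLE} "
            f"SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = '{retention}')"
        )
        logger.info("Set deletedFileRetentionDuration to %s (%d rows)", retention, row_count)

    except Exception:
        # Log the full stack trace, then re-raise so the Databricks job run is
        # marked failed and its retry/alerting policy can take over.
        logger.exception("Daily transaction processing failed")
        raise


process_daily_transactions()
```

When this notebook is attached to a Databricks job with a daily schedule, the re-raised exception marks the task run as failed so the job's retry policy and failure notifications take over; a periodic VACUUM then enforces whichever retention window was set.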
Author: LeetQuiz Editorial Team
You are tasked with integrating a Jupyter notebook into a data pipeline for a financial services company. The notebook must process daily transaction data, perform anomaly detection, and update a Delta Lake table. Describe the steps you would take to ensure the notebook is seamlessly integrated into the pipeline, including how you would handle potential errors and data retention policies.
A
Use Azure Data Factory to schedule the notebook execution, implement basic exception handling within the notebook, and set a fixed data retention period in the Delta Lake.
B
Manually execute the notebook daily, use Python try-except blocks for error handling, and configure the Delta Lake for indefinite data retention.
C
Automate the notebook execution using Azure Databricks jobs, implement robust exception handling with logging, and configure the Delta Lake for dynamic data retention based on data volume.
D
Schedule the notebook using cron jobs, handle exceptions by restarting the notebook, and set a rolling data retention policy in the Delta Lake.