
Answer-first summary for fast verification
Answer: Utilize the Databricks API to automate the export of notebooks as JSON files and store these files in a cloud storage service like Azure Blob Storage, setting up lifecycle management policies to optimize storage costs.
Option B is the best approach because it leverages the Databricks API for automation, ensuring scalability and minimizing manual effort. Storing the backups in a cloud storage service such as Azure Blob Storage satisfies the requirement for secure, cloud-based storage and allows lifecycle management policies to control costs. Option A is not scalable because it is manual, and a local server does not meet the cloud-based storage requirement. Option C is not a proactive backup strategy and may not cover all data-loss scenarios. Option D, while comprehensive, adds unnecessary complexity and cost when the primary need is to back up notebooks, and an on-premises data center does not comply with the cloud-based storage requirement.
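A minimal sketch of the automated export described in Option B, using the Databricks Workspace API's `/api/2.0/workspace/export` endpoint. The workspace URL, token handling, and notebook path below are illustrative assumptions; in a real job you would enumerate notebooks with `/api/2.0/workspace/list` and push each exported file to Azure Blob Storage.

```python
import base64
import json
import os
import urllib.request
from urllib.parse import urlencode

# Assumed workspace URL and token; in practice these come from your
# environment or a secret store, never hard-coded.
DATABRICKS_HOST = os.environ.get(
    "DATABRICKS_HOST", "https://adb-example.azuredatabricks.net"
)
DATABRICKS_TOKEN = os.environ.get("DATABRICKS_TOKEN", "")


def export_url(host: str, notebook_path: str, fmt: str = "JUPYTER") -> str:
    """Build the Workspace API export URL for one notebook.

    format=JUPYTER yields an .ipynb file, which is JSON on disk.
    """
    return f"{host}/api/2.0/workspace/export?" + urlencode(
        {"path": notebook_path, "format": fmt}
    )


def decode_export(payload: dict) -> bytes:
    """The export endpoint wraps the notebook in base64 under 'content'."""
    return base64.b64decode(payload["content"])


def backup_notebook(notebook_path: str) -> bytes:
    """Fetch one notebook's exported content as raw bytes."""
    req = urllib.request.Request(
        export_url(DATABRICKS_HOST, notebook_path),
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        return decode_export(json.load(resp))


if __name__ == "__main__":
    # Hypothetical notebook path for illustration only.
    data = backup_notebook("/Users/analyst@example.com/etl_notebook")
    # Upload `data` to Azure Blob Storage (e.g. BlobClient.upload_blob from
    # the azure-storage-blob package); a lifecycle management policy on the
    # container then tiers or expires old backups automatically.
```

Scheduling this script (for example as a Databricks job or an Azure Function on a timer) removes the manual step entirely, which is what makes Option B scale to a large number of notebooks.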
Author: LeetQuiz Editorial Team
In a Databricks workspace, you are tasked with ensuring the regular backup of a large number of notebooks to safeguard against data loss. The solution must be scalable, cost-effective, and minimize manual intervention. Additionally, the backup process should comply with organizational policies that require backups to be stored in a secure, cloud-based storage service for easy access and recovery. Considering these requirements, which of the following approaches is the BEST to perform this backup, and why? (Choose one option.)
A
Manually export each notebook as a JSON file on a weekly basis and store these files in a local secure server, ensuring that each file is encrypted before storage.
B
Utilize the Databricks API to automate the export of notebooks as JSON files and store these files in a cloud storage service like Azure Blob Storage, setting up lifecycle management policies to optimize storage costs.
C
Depend solely on Databricks' built-in data recovery features, assuming that these features provide sufficient protection against all types of data loss scenarios.
D
Implement a third-party backup solution that captures the entire Databricks workspace, including notebooks, clusters, and configurations, storing the backup in an on-premises data center for enhanced security.