
Answer-first summary for fast verification
Answer: Databricks Jobs
**Correct Answer: C. Databricks Jobs.** **Why?** - **Designed for Scheduling:** Databricks Jobs is specifically designed to schedule Spark workloads, including notebooks. It allows you to define execution times, schedules (using cron syntax), and dependencies, making it ideal for recurring tasks. - **Integration and Management:** Jobs seamlessly integrate with Databricks notebooks. You can schedule a notebook directly through the UI or programmatically using the Jobs API in Python. It also provides convenient monitoring and management of scheduled runs. - **Scalability and Reliability:** Jobs can manage complex workflows and utilize Databricks clusters for distributed execution, ensuring your time-sensitive project runs reliably and at scale. **Other Options Analyzed:** - **A. Databricks REST API:** While useful for programmatic interactions, it's not specifically designed for scheduling notebooks and would require additional manual scripting and monitoring. - **B. MLlib CrossValidator:** This is a tool for model evaluation, not workflow automation, making it unsuitable for scheduling notebook execution. - **D. Databricks Delta:** Delta is a storage format and does not offer scheduling capabilities. It might store notebook data but cannot trigger execution. **Conclusion:** Databricks Jobs is the most efficient and appropriate solution for automating notebook execution at specific intervals in Databricks, especially for time-sensitive projects. Its scheduling features, integration, and reliability make it the perfect choice.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
A data scientist is working on a time-sensitive machine learning project in Databricks and needs to schedule the execution of a notebook at specific intervals. Which Databricks feature is best suited for automating this task?
A
Databricks REST API
B
MLlib CrossValidator
C
Databricks Jobs
D
Databricks Delta