In a complex data pipeline with multiple interdependent tasks scheduled via Databricks, which approach is most efficient for managing these dependencies to minimize idle time and resource wastage?
A
Implementing a custom dependency resolution framework within Databricks notebooks.
B
Utilizing external orchestration tools like Apache Airflow to define and manage task dependencies outside of Databricks.
C
Manually monitoring task executions and triggering dependent tasks upon completion.
D
Relying solely on Databricks' built-in job scheduling features to manage all dependencies.
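
Whichever option is chosen, the underlying problem is the same: resolving a task dependency graph so that independent tasks run concurrently rather than idling. A minimal sketch of this idea (task names are hypothetical, not tied to any specific Databricks or Airflow API) groups tasks into "waves" via Kahn-style topological layering, where every task in a wave can execute in parallel:

```python
from collections import defaultdict

def execution_waves(deps):
    """Group tasks into waves: each wave holds tasks whose prerequisites
    are all satisfied by earlier waves, so tasks within a wave can run
    concurrently (Kahn-style topological layering)."""
    indegree = {task: len(prereqs) for task, prereqs in deps.items()}
    dependents = defaultdict(list)
    for task, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(task)

    waves = []
    ready = sorted(t for t, n in indegree.items() if n == 0)
    while ready:
        waves.append(ready)
        next_ready = []
        for t in ready:
            for d in dependents[t]:
                indegree[d] -= 1
                if indegree[d] == 0:
                    next_ready.append(d)
        ready = sorted(next_ready)
    return waves

# Hypothetical pipeline for illustration only.
pipeline = {
    "ingest": [],
    "clean": ["ingest"],
    "features": ["clean"],
    "metrics": ["clean"],
    "report": ["features", "metrics"],
}
print(execution_waves(pipeline))
# -> [['ingest'], ['clean'], ['features', 'metrics'], ['report']]
```

Here `features` and `metrics` share a wave because both depend only on `clean`; an orchestrator that schedules them sequentially would leave one worker idle, which is exactly the inefficiency the question asks about.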