
Answer-first summary for fast verification
Answer: Opt for a shared job cluster for all tasks in the job
The best strategy is a shared job cluster for all tasks in the job. With cluster reuse, multiple tasks run on the same job cluster: it starts when the first task that uses it begins and terminates after the last task that uses it completes. This avoids a separate cluster startup for every task, which reduces both the time spent on initialization and the cost of underutilized clusters. Cluster configuration stays fine-grained, and utilization improves, especially when tasks run in parallel. By comparison, the all-purpose options (A, B, E) are billed at a higher rate than job compute, and option D pays the startup cost once per task. For more details, refer to: [Cluster Reuse | Databricks](https://docs.databricks.com/clusters/cluster-reuse.html).
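As a rough illustration of the explanation above, the following Python sketch builds a job definition in the shape used by the Databricks Jobs API 2.1, where one entry under `job_clusters` is referenced by every task through `job_cluster_key`. The job name, cluster key, task keys, notebook paths, Spark version, node type, and worker count are all hypothetical placeholders, not values from the question, and the sketch only constructs the payload without calling any API.

```python
import json

# Hypothetical key; any string works as long as tasks reference the same one.
SHARED_CLUSTER_KEY = "shared_job_cluster"


def build_job_settings(task_specs):
    """Return a Jobs API 2.1-style settings dict where all tasks reuse one job cluster."""
    tasks = []
    for spec in task_specs:
        task = {
            "task_key": spec["task_key"],
            # Every task points at the same shared job cluster.
            "job_cluster_key": SHARED_CLUSTER_KEY,
            "notebook_task": {"notebook_path": spec["notebook_path"]},
        }
        if spec.get("depends_on"):
            task["depends_on"] = [{"task_key": key} for key in spec["depends_on"]]
        tasks.append(task)

    return {
        "name": "example_multi_task_job",  # hypothetical job name
        "job_clusters": [
            {
                "job_cluster_key": SHARED_CLUSTER_KEY,
                # Placeholder cluster spec; adjust to your workspace and cloud.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 2,
                },
            }
        ],
        "tasks": tasks,
    }


# Hypothetical tasks: one ingest step followed by two parallel transforms.
job_settings = build_job_settings(
    [
        {"task_key": "ingest", "notebook_path": "/Jobs/ingest"},
        {"task_key": "clean", "notebook_path": "/Jobs/clean", "depends_on": ["ingest"]},
        {"task_key": "aggregate", "notebook_path": "/Jobs/aggregate", "depends_on": ["ingest"]},
    ]
)

print(json.dumps(job_settings, indent=2))
```

Because `clean` and `aggregate` both depend only on `ingest`, they can run in parallel on the same cluster, which is the case where sharing tends to pay off most. Option D would instead put a separate `new_cluster` block on each task.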
Author: LeetQuiz Editorial Team
A team of data engineers has developed a Databricks Job in production with multiple tasks. Their goal is to minimize costs related to compute resources and optimize the execution time for each task. What is the best strategy to achieve these objectives?
A. Utilize a single all-purpose cluster for all tasks in the job
B. Employ a new all-purpose cluster for each task in the job
C. Opt for a shared job cluster for all tasks in the job
D. Use a new job cluster for each task in the job
E. Combine a few all-purpose clusters and a few job clusters for different tasks in the job