
A data engineer has a Job with multiple tasks that runs nightly. Each of the tasks runs slowly because the clusters take a long time to start. Which of the following actions can the data engineer perform to improve the startup time for the clusters used for the Job?
A. They can use endpoints available in Databricks SQL
B. They can use job clusters instead of all-purpose clusters
C. They can configure the clusters to be single-node
D. They can use clusters that are from a cluster pool
E. They can configure the clusters to autoscale for larger data sizes
Explanation:
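Correct Answer: D
Why D is correct: A cluster pool keeps a set of idle, ready-to-use instances, so a cluster created from the pool can take its nodes from that set instead of waiting for the cloud provider to provision new ones, which shortens startup time
Why other options are incorrect:
A: Databricks SQL endpoints are for SQL analytics workloads, not for improving cluster startup time for general jobs
B: Job clusters are created fresh for each run and terminated afterward, so switching from all-purpose clusters to job clusters does not by itself shorten startup time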
C: While single-node clusters may start slightly faster than multi-node clusters, this is not the most effective solution and may not meet the computational requirements of the job
E: Autoscaling helps with handling varying data sizes during job execution but does not address cluster startup time
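To make option D concrete, the following is a minimal sketch of how a Job definition might point each task's cluster at a pool. It only builds the request body as a Python dictionary and does not call any API. The field names (`tasks`, `task_key`, `notebook_task`, `new_cluster`, `instance_pool_id`) follow the Databricks Jobs API as commonly documented, but the helper names, notebook paths, pool ID, and runtime version are hypothetical placeholders, so check them against your workspace before use.

```python
import json


def pooled_cluster_spec(pool_id: str, spark_version: str, num_workers: int) -> dict:
    """Return a cluster spec whose nodes come from an instance pool.

    The helper name and defaults are illustrative, not part of any SDK.
    """
    return {
        "spark_version": spark_version,
        "num_workers": num_workers,
        # The pool determines the node type, so node_type_id is left out.
        "instance_pool_id": pool_id,
    }


def build_job_payload(
    name: str,
    notebook_paths: list,
    pool_id: str,
    spark_version: str = "13.3.x-scala2.12",  # placeholder runtime version
    num_workers: int = 2,
) -> dict:
    """Build a job definition in which every task uses a pool-backed cluster."""
    tasks = []
    for index, path in enumerate(notebook_paths):
        tasks.append(
            {
                "task_key": f"task_{index}",
                "notebook_task": {"notebook_path": path},
                "new_cluster": pooled_cluster_spec(pool_id, spark_version, num_workers),
            }
        )
    return {"name": name, "tasks": tasks}


if __name__ == "__main__":
    # Hypothetical notebook paths and pool ID, for illustration only.
    payload = build_job_payload(
        "nightly-job",
        ["/Jobs/ingest", "/Jobs/transform"],
        pool_id="<your-pool-id>",
    )
    print(json.dumps(payload, indent=2))
```

The resulting dictionary could be submitted as the body of a job-creation request. The point of the sketch is that each task's cluster spec references the pool, so its nodes can be taken from the pool's idle instances when any are available.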
Best Practices: