
Question 4
Which of the following describes a scenario in which a data engineer will want to use a Job cluster instead of an all-purpose cluster?
Explanation:
Job clusters are designed for automated, scheduled workflows and production jobs, while all-purpose clusters are intended for interactive development and collaboration.
Let's analyze each option:
A. An ad-hoc analytics report needs to be developed while minimizing compute costs. ❌ This describes interactive development work, which is better suited for all-purpose clusters. Job clusters are not ideal for ad-hoc work.
B. A data team needs to collaborate on the development of a machine learning model. ❌ This is collaborative development work, which requires an all-purpose cluster for interactive notebook development and team collaboration.
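C. An automated workflow needs to be run every 30 minutes. ✅ This is the perfect use case for Job clusters. Automated, scheduled workflows benefit from Job clusters because the cluster is created when the run starts and terminated when the run finishes, so compute is paid for only while the job is executing.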
D. A Databricks SQL query needs to be scheduled for upward reporting. ❌ Databricks SQL queries run on SQL warehouses, not on Job clusters, so a scheduled query uses the warehouse it is attached to. This is not a Job cluster use case.
E. A data engineer needs to manually investigate a production error. ❌ This requires interactive investigation and debugging, which is better suited for all-purpose clusters where the engineer can run commands interactively.
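Key differences between Job clusters and All-purpose clusters:
Job clusters: created by the job scheduler for a single run and terminated when that run completes; suited to automated, scheduled, and production workloads.
All-purpose clusters: created manually and kept running for interactive notebook work; suited to ad-hoc analysis, debugging, and team collaboration.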
The primary correct answer is C because it clearly describes an automated, scheduled workflow that runs frequently, which is the ideal scenario for Job clusters.
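To make option C concrete, here is a minimal sketch of what a job definition for a 30-minute schedule might look like. It only builds the request body as a Python dictionary and sends nothing. The field names (`new_cluster`, `schedule`, `quartz_cron_expression`) follow the Databricks Jobs API 2.1 as commonly documented and should be checked against the current reference. The job name, notebook path, runtime version, and node type are hypothetical placeholders.

```python
import json


def build_scheduled_job(notebook_path: str) -> dict:
    """Build a job definition that runs a notebook every 30 minutes on a Job cluster.

    Assumption: the layout mirrors the Jobs API 2.1 create-job request body.
    """
    return {
        "name": "half-hourly-workflow",  # hypothetical job name
        "schedule": {
            # Quartz cron: at second 0, every 30 minutes, every hour, every day
            "quartz_cron_expression": "0 0/30 * * * ?",
            "timezone_id": "UTC",
            "pause_status": "UNPAUSED",
        },
        "tasks": [
            {
                "task_key": "run_notebook",
                "notebook_task": {"notebook_path": notebook_path},
                # "new_cluster" requests a Job cluster that exists only for the run.
                # An all-purpose cluster would be referenced with
                # "existing_cluster_id" instead.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # placeholder runtime
                    "node_type_id": "i3.xlarge",  # placeholder node type
                    "num_workers": 2,
                },
            }
        ],
    }


payload = build_scheduled_job("/Workflows/half_hourly")  # hypothetical path
print(json.dumps(payload, indent=2))
```

The key design point is the `new_cluster` block: because the task describes a cluster rather than pointing at an existing one, each scheduled run gets its own short-lived cluster, which matches the Job cluster behavior described above.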