Databricks Certified Data Engineer - Associate

Explanation:

The correct answer is A. They can utilize multiple tasks in a single job with a linear dependency.

Task Dependencies: In Databricks Jobs, you can create multiple tasks within a single job and define dependencies between them. By using a linear dependency (Task 2 depends on Task 1), you ensure that Task 2 only starts after Task 1 has completed successfully.
Eliminates Timing Issues: This approach removes the hard-coded time dependency (12:30 AM start time) and replaces it with a logical dependency based on task completion.
Built-in Job Orchestration: Databricks Jobs provide native support for task dependencies, making this the most reliable and maintainable solution.
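
The linear dependency described above can be sketched as a job specification. This is a minimal illustration, assuming the shape of a Databricks Jobs API 2.1 job definition, where each task has a `task_key` and an optional `depends_on` list. The job name, task keys, and notebook paths are hypothetical, cluster settings are omitted, and the `execution_order` helper is not part of Databricks; it only checks locally that the declared dependencies produce the intended order.

```python
import json

# Hypothetical job specification in the shape of a Databricks Jobs API 2.1
# job definition. The job name, task keys, and notebook paths are
# illustrative assumptions; cluster settings are omitted for brevity.
job_spec = {
    "name": "nightly_pipeline",
    "tasks": [
        {
            "task_key": "first_step",
            "notebook_task": {"notebook_path": "/Pipelines/first_step"},
        },
        {
            "task_key": "second_step",
            # Linear dependency: second_step waits for first_step to succeed.
            "depends_on": [{"task_key": "first_step"}],
            "notebook_task": {"notebook_path": "/Pipelines/second_step"},
        },
    ],
}


def execution_order(spec):
    """Return task keys in an order that respects every depends_on entry.

    Local helper for illustration only; it is not a Databricks API.
    """
    pending = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in spec["tasks"]
    }
    order = []
    while pending:
        # A task is ready once all of its dependencies are already ordered.
        ready = sorted(key for key, deps in pending.items() if deps <= set(order))
        if not ready:
            raise ValueError("cycle or unknown task in depends_on")
        order.extend(ready)
        for key in ready:
            del pending[key]
    return order


if __name__ == "__main__":
    print(execution_order(job_spec))
    print(json.dumps(job_spec, indent=2))
```

Because the second task is triggered by the first task's completion rather than by a clock time, a slow first task delays the second task instead of causing it to run against incomplete data.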

B. Cluster pools: While cluster pools can improve efficiency by reducing cluster startup time, they don't solve the fundamental dependency/timing issue.
C. Retry policy: A retry policy helps with transient failures but doesn't address the core problem of the second job starting before the first job completes.
D. Limit output size: This doesn't address the dependency issue; it only potentially reduces the chance of failure due to resource constraints.
E. Streaming data: Streaming is for real-time data processing, not for batch job dependencies. This would fundamentally change the architecture and isn't appropriate for nightly batch jobs.

In Databricks, when you have jobs with dependencies, it's best to:

Community

KKeng

Last updated: February 21, 2026 at 14:03

A. They can utilize multiple tasks in a single job with a linear dependency: 67.8%

B. They can use cluster pools to help the Jobs run more efficiently: 8.1%