
## Answer

**D.** They can institute a retry policy for the task that periodically fails
## Explanation

**Correct Answer: D** - They can institute a retry policy for the task that periodically fails

**Why this is correct:**

1. **Targeted approach**: The problem is specific to one task that fails only 10% of the time. Implementing a retry policy for just that task is the most efficient solution.
2. **Cost-effective**: Retrying only the failing task minimizes compute costs compared to retrying the entire Job or running multiple Job instances.
3. **High success probability**: With a 90% per-attempt success rate, a few retries should ensure completion without excessive resource consumption.
4. **Standard practice**: Task-level retry policies are a common pattern for handling intermittent failures in data pipelines.

**Why the other options are incorrect:**

**A. They can institute a retry policy for the entire Job**
- This would retry all tasks, not just the failing one, wasting compute on tasks that already succeeded.
- It increases costs unnecessarily by re-running successful tasks.

**B. They can observe the task as it runs to try and determine why it is failing**
- While investigation is important for root-cause analysis, it does not ensure the Job completes each night.
- It is a reactive approach rather than a proactive solution for ensuring completion.

**C. They can set up the Job to run multiple times ensuring that at least one will complete**
- Running multiple instances of the entire Job would significantly increase compute costs.
- With a 90% success rate, running multiple full instances is wasteful overkill.

**E. They can utilize a Jobs cluster for each of the tasks in the Job**
- Provisioning a separate cluster for each task increases costs.
- It does not address the intermittent failure; it only changes the execution environment.
**Best Practice Recommendation:**

In Databricks Jobs, you can configure a task-level retry policy with:

- A maximum number of retries (e.g., 3-5)
- A minimum interval between retry attempts
- Whether to retry when the task times out

This approach ensures the Job completes while maintaining cost efficiency for intermittent failures.
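As a rough sketch, the retry settings above map onto fields of the Databricks Jobs API task specification (`max_retries`, `min_retry_interval_millis`, `retry_on_timeout`); the task key and notebook path here are hypothetical, and the snippet also works out why a small retry budget is enough given a 90% per-attempt success rate:

```python
# Sketch of a task spec with a task-level retry policy, following the
# Databricks Jobs API field names. Task name and notebook path are
# hypothetical placeholders.
task_spec = {
    "task_key": "flaky_task",            # hypothetical task name
    "notebook_task": {"notebook_path": "/Jobs/flaky_step"},
    "max_retries": 3,                    # retry up to 3 additional attempts
    "min_retry_interval_millis": 60_000, # wait 1 minute between attempts
    "retry_on_timeout": False,
}

# With a 10% failure rate per attempt, the chance that at least one of
# the (1 + max_retries) attempts succeeds:
p_fail = 0.10
attempts = 1 + task_spec["max_retries"]
p_success = 1 - p_fail ** attempts
print(f"{p_success:.4%}")  # 99.9900%
```

With three retries, the task fails only if all four attempts fail independently, so the nightly Job completes in effectively every run while only the failing task is ever re-executed.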
Author: Keng Suppaseth
A data engineer has a Job with multiple tasks that runs nightly. One of the tasks unexpectedly fails during 10 percent of the runs.
Which of the following actions can the data engineer perform to ensure the Job completes each night while minimizing compute costs?
A. They can institute a retry policy for the entire Job
B. They can observe the task as it runs to try and determine why it is failing
C. They can set up the Job to run multiple times ensuring that at least one will complete
D. They can institute a retry policy for the task that periodically fails
E. They can utilize a Jobs cluster for each of the tasks in the Job