
Question 37
A data engineer has a Job with multiple tasks that runs nightly. One of the tasks unexpectedly fails during 10 percent of the runs.
Which of the following actions can the data engineer perform to ensure the Job completes each night while minimizing compute costs?
Explanation:
Option D is the correct answer because:
Targeted retry policy: A retry policy set only on the failing task is the most cost-effective fix. The task reruns automatically when it fails, and the tasks that already succeeded are not rerun.
Cost minimization: Because the task fails in only about 10% of runs, retries are rarely needed, and each retry reruns a single task. That is far cheaper than rerunning the entire Job, running it several times per night, or moving to more expensive compute.
Job completion: Retries cannot strictly guarantee success, but with enough of them the chance that every attempt fails becomes very small. For example, if attempts fail independently 10% of the time, two retries leave about a 0.1% chance that the task fails all three attempts, so the Job can be expected to complete each night.
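As an illustration of the points above, here is a minimal sketch of a Job definition in which only the unreliable task carries a retry policy. The retry fields (`max_retries`, `min_retry_interval_millis`, `retry_on_timeout`) follow the Databricks Jobs API task settings; the job name, task keys, notebook paths, and the independence of failures between attempts are assumptions made for the example.

```python
import json


def build_job_settings(max_retries: int = 2) -> dict:
    """Build a Jobs API-style payload with retries on one task only.

    The job name, task keys, and notebook paths are hypothetical.
    """
    return {
        "name": "nightly_job",
        "tasks": [
            {
                # Reliable task: no retry policy, so it never reruns needlessly.
                "task_key": "ingest",
                "notebook_task": {"notebook_path": "/Jobs/ingest"},
            },
            {
                # Intermittently failing task: retried on its own.
                "task_key": "flaky_transform",
                "depends_on": [{"task_key": "ingest"}],
                "notebook_task": {"notebook_path": "/Jobs/transform"},
                "max_retries": max_retries,
                "min_retry_interval_millis": 60_000,  # wait 1 minute between attempts
                "retry_on_timeout": False,
            },
        ],
    }


def prob_all_attempts_fail(failure_rate: float, max_retries: int) -> float:
    """Chance that the first attempt and every retry fail.

    Assumes each attempt fails independently with the same probability.
    """
    return failure_rate ** (max_retries + 1)


if __name__ == "__main__":
    print(json.dumps(build_job_settings(), indent=2))
    print(prob_all_attempts_fail(0.10, 2))
```

The payload could be sent to the Jobs create or reset endpoint, or the same settings entered in the task's retry options in the Jobs UI. If failures are correlated (for example, an upstream outage that lasts all night), the independence assumption does not hold and retries help less.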
Why other options are less optimal:
Option A: Retrying the entire job is inefficient and costly, as it would rerun all tasks (including successful ones) when only one task fails.
Option B: While observing the task is good for troubleshooting, it doesn't solve the immediate problem of ensuring job completion each night.
Option C: Running the job multiple times is wasteful and expensive, as it would consume compute resources for duplicate runs.
Option E: Using separate Jobs clusters for each task increases complexity and costs without solving the underlying reliability issue.
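The cost argument behind these comparisons can be checked with a rough expected-cost calculation. The sketch below uses made-up numbers: the other tasks cost 30 compute units in total, the unreliable task costs 10, the task fails 10% of the time independently on each attempt, and up to two retries are allowed. It also assumes that a whole-Job retry reruns every task.

```python
def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected number of attempts when retrying up to max_retries times.

    Attempt k+1 only happens if the first k attempts all failed,
    which has probability failure_rate ** k (independent failures assumed).
    """
    return sum(failure_rate ** k for k in range(max_retries + 1))


def nightly_costs(other_tasks_cost: float, flaky_task_cost: float,
                  failure_rate: float, max_retries: int) -> dict:
    """Expected nightly compute cost for three strategies (hypothetical units)."""
    attempts = expected_attempts(failure_rate, max_retries)
    full_run = other_tasks_cost + flaky_task_cost
    return {
        # Only the unreliable task is repeated on failure.
        "task_level_retry": other_tasks_cost + flaky_task_cost * attempts,
        # Every task is repeated whenever the unreliable task fails.
        "whole_job_retry": full_run * attempts,
        # The Job always runs twice, whether or not anything failed.
        "run_job_twice": 2 * full_run,
    }


if __name__ == "__main__":
    for strategy, cost in nightly_costs(30.0, 10.0, 0.10, 2).items():
        print(f"{strategy}: {cost:.1f}")
```

With these assumed numbers the task-level retry averages about 41.1 units per night, a whole-Job retry about 44.4, and running the Job twice 80. The gap between the first two widens as the other tasks become more expensive relative to the failing one.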
The retry policy approach directly addresses the intermittent failure while maintaining cost efficiency.