
Answer-first summary for fast verification
Answer: Use the 'Repair and rerun' feature in the Databricks Jobs UI to execute only the failed tasks and their dependencies.
Databricks Jobs includes a built-in 'Repair and rerun' capability that reruns only the failed tasks within a job run (along with any downstream tasks that depend on them), preserving the results of tasks that already succeeded. This reduces cost and execution time by avoiding redundant recomputation of data that has already been successfully transformed.
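Beyond the Jobs UI, the same repair can be triggered programmatically via the Jobs API 2.1 `runs/repair` endpoint. The sketch below builds the repair request body and posts it; the workspace URL and token are placeholders, and the exact response fields should be checked against your workspace's API version.

```python
import json
import urllib.request

# Placeholder workspace URL and token -- substitute your own values.
HOST = "https://example.cloud.databricks.com"
TOKEN = "dapi-EXAMPLE-TOKEN"

def build_repair_payload(run_id, task_keys=None):
    """Build the request body for the Jobs 2.1 repair-run endpoint.

    Without task_keys, rerun_all_failed_tasks asks Databricks to rerun
    every failed task (and its downstream dependents); with task_keys,
    only the named tasks are repaired.
    """
    payload = {"run_id": run_id}
    if task_keys:
        payload["rerun_tasks"] = list(task_keys)
    else:
        payload["rerun_all_failed_tasks"] = True
    return payload

def repair_run(run_id, task_keys=None):
    """POST the repair request; returns the parsed JSON response."""
    body = json.dumps(build_repair_payload(run_id, task_keys)).encode()
    req = urllib.request.Request(
        f"{HOST}/api/2.1/jobs/runs/repair",
        data=body,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Show the payload for repairing all failed tasks of run 12345.
    print(build_repair_payload(12345))
```

Note that successful task results are preserved automatically; the repair creates a new repair attempt within the same job run rather than a fresh run.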
Author: LeetQuiz Editorial Team
A data engineering team needs to address failures within a multi-task Databricks Jobs workflow. To optimize resource utilization and time, they must repair failed tasks while ensuring minimal recomputation of tasks that have already completed successfully. What is the most efficient approach?
A
Restart the compute cluster and manually trigger a full rerun of the entire workflow.
B
Use the 'Repair and rerun' feature in the Databricks Jobs UI to execute only the failed tasks and their dependencies.
C
Programmatically create a new temporary workflow designed specifically to handle the logic of the failed tasks.
D
Clone the existing job definition and execute the new job from the beginning.