
A data engineer has a single-task Job that runs each morning before they begin working. After identifying an upstream data issue, they need to set up another task to run a new notebook prior to the original task. Which of the following approaches can the data engineer use to set up the new task?
A
They can clone the existing task in the existing Job and update it to run the new notebook.
B
They can create a new task in the existing Job and then add it as a dependency of the original task.
C
They can create a new task in the existing Job and then add the original task as a dependency of the new task.
D
They can create a new job from scratch and add both tasks to run concurrently.
E
They can clone the existing task to a new Job and then edit it to run the new notebook.
Explanation:
The correct answer is B because:
Requirement: The data engineer needs to run a new notebook prior to the original task. This means the new task should execute first, followed by the original task.
Task Dependencies in Databricks Jobs: In Databricks, you can set up task dependencies where one task must complete before another starts. When you add a task as a dependency of another task, the dependent task will wait for the upstream task to complete.
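Option B Analysis: "They can create a new task in the existing Job and then add it as a dependency of the original task." This means the new task becomes the upstream task and the original task becomes the dependent task, so the new notebook runs first and the original task starts only after it completes.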
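Why other options are incorrect: Options A and E only clone the existing task and define no dependency, so nothing guarantees that the new notebook finishes before the original task starts. Option C reverses the order by making the new task wait for the original task. Option D runs both tasks concurrently, which enforces no order at all.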
Best Practice: Adding tasks with dependencies in the same job maintains the existing scheduling configuration and ensures proper execution order.
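The dependency in option B can be sketched as job settings in the general shape of a Jobs API 2.1 payload, where a task lists its upstream tasks under `depends_on`. This is a minimal illustration, not a definitive configuration: the job name, task keys, and notebook paths are hypothetical, and the `execution_order` helper is a local stand-in for the ordering logic, not part of any Databricks library.

```python
from graphlib import TopologicalSorter

# Hypothetical job settings shaped like a Jobs API 2.1 payload.
# The job name, task keys, and notebook paths are invented for illustration.
job_settings = {
    "name": "morning_job",
    "tasks": [
        {
            # The new task: it has no dependencies, so it can start first.
            "task_key": "fix_upstream_data",
            "notebook_task": {"notebook_path": "/Repos/example/fix_upstream_data"},
        },
        {
            # The original task: the new task is added as its dependency.
            "task_key": "original_task",
            "notebook_task": {"notebook_path": "/Repos/example/original_notebook"},
            "depends_on": [{"task_key": "fix_upstream_data"}],
        },
    ],
}


def execution_order(settings: dict) -> list[str]:
    """Return task keys in an order that respects every depends_on entry."""
    # Map each task to the set of tasks that must finish before it starts.
    graph = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in settings["tasks"]
    }
    # A topological sort lists upstream tasks before their dependents.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(job_settings))
```

Because both tasks live in the same Job, the existing morning schedule applies to the whole run, and the `depends_on` entry on the original task is what places the new notebook ahead of it.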