
A data engineer has a single-task Job that runs each morning before they begin working. After identifying an upstream data issue, they need to set up another task to run a new notebook prior to the original task.
Which approach can the data engineer use to set up the new task?
A
They can clone the existing task in the existing Job and update it to run the new notebook.
B
They can create a new task in the existing Job and then add it as a dependency of the original task.
C
They can create a new task in the existing Job and then add the original task as a dependency of the new task.
D
They can create a new job from scratch and add both tasks to run concurrently.
Explanation:
The correct answer is B because:
Requirement Analysis: The data engineer needs to run a new notebook prior to the original task. This means the new task must execute first, and then the original task should run.
Task Dependencies in Databricks Jobs: In Databricks, you can set up task dependencies so that one task must complete before another can start. When task X is added as a dependency of task Y, Y is the dependent task: it waits for X (the prerequisite) to complete before it starts.
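The dependency described above can be sketched as a job definition. This is a minimal illustration, not the engineer's actual configuration: the field names (`tasks`, `task_key`, `depends_on`, `notebook_task`, `notebook_path`) follow the Jobs API task format as commonly documented, while the job name, task keys, and notebook paths are hypothetical. The `execution_order` helper is a local illustration of how the dependency constrains ordering, not the way Databricks schedules tasks internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job settings: names and paths are assumptions for illustration.
job_settings = {
    "name": "morning_job",
    "tasks": [
        {
            # The new task that must run first.
            "task_key": "fix_upstream_data",
            "notebook_task": {"notebook_path": "/Workspace/Shared/fix_upstream_data"},
        },
        {
            # The original task, now depending on the new task (option B).
            "task_key": "original_task",
            "notebook_task": {"notebook_path": "/Workspace/Shared/original_notebook"},
            "depends_on": [{"task_key": "fix_upstream_data"}],
        },
    ],
}


def execution_order(settings):
    """Return one valid run order implied by the depends_on entries."""
    # Map each task to the set of tasks it must wait for.
    graph = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in settings["tasks"]
    }
    # static_order() yields prerequisites before the tasks that depend on them.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(job_settings)
print(order)
```

Moving the `depends_on` entry to the new task instead (pointing at `original_task`) would reverse the order, which is the mistake in option C.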
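Option B Analysis: Creating a new task in the existing Job and adding it as a dependency of the original task makes the original task depend on the new one. The new notebook therefore runs first, and the original task starts only after it completes, which is the required order.
Why Other Options Are Incorrect:
A: Cloning the existing task and updating the clone to run the new notebook adds a second task but does not define any order between the two tasks.
C: Adding the original task as a dependency of the new task reverses the required order: the original task would run first and the new notebook afterwards.
D: Running both tasks concurrently in a new job does not guarantee that the new notebook finishes before the original task starts, and it discards the existing Job for no benefit.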
Best Practice: This approach keeps the existing Job, including its morning schedule, and only adds the preprocessing step ahead of the original task, so nothing else about the Job has to be rebuilt.