
Explanation:
The correct answer is B because:
Requirement: The data engineer needs to run a new notebook prior to the original task. This means the new task must execute before the original task.
Task Dependencies in Databricks Jobs: A Databricks Job can define dependencies between its tasks so that one task must complete before another starts. When task X is added as a dependency of task Y, Y is the dependent task and, by default, runs only after X completes successfully.
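The dependency direction can be sketched in Python. This is a minimal illustration, not a definitive job definition: the field names (`tasks`, `task_key`, `notebook_task`, `notebook_path`, `depends_on`) follow the Jobs API 2.1 task format as commonly documented, while the job name, task keys, and notebook paths are hypothetical. The `run_order` helper is a local simulation of dependency ordering and makes no Databricks call.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job settings: names, task keys, and notebook paths are
# placeholders chosen for illustration only.
job_settings = {
    "name": "morning_job",
    "tasks": [
        {
            "task_key": "fix_upstream_data",
            "notebook_task": {"notebook_path": "/Jobs/fix_upstream_data"},
        },
        {
            "task_key": "original_task",
            "notebook_task": {"notebook_path": "/Jobs/original_notebook"},
            # The original task lists the new task as its dependency,
            # so the new task must finish first.
            "depends_on": [{"task_key": "fix_upstream_data"}],
        },
    ],
}


def run_order(settings):
    """Return task keys in an order that respects every depends_on entry."""
    # Map each task to the set of tasks it depends on (its predecessors).
    graph = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in settings["tasks"]
    }
    return list(TopologicalSorter(graph).static_order())


order = run_order(job_settings)
print(order)
```

Moving the `depends_on` entry onto the new task instead (pointing at `original_task`) would reverse the order, which is the mistake in option C.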
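Option B Analysis: Creating a new task in the existing Job and adding it as a dependency of the original task makes the original task the dependent one. The new notebook therefore runs first, and the original task starts only after it completes.
Why other options are incorrect:
A: Cloning the existing task and pointing it at the new notebook adds a task but defines no ordering between it and the original task.
C: Adding the original task as a dependency of the new task reverses the required order, so the original task would run first.
D: Running both tasks concurrently in a new job does not guarantee that the new notebook finishes before the original task starts.
E: Cloning the task into a new Job puts the new notebook in a separate Job, so no task dependency links it to the original task.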
Best Practice: This approach maintains the existing job structure while adding the necessary preprocessing step, ensuring data quality issues are addressed before the main processing task runs.
A data engineer has a single-task Job that runs each morning before they begin working. After identifying an upstream data issue, they need to set up another task to run a new notebook prior to the original task. Which of the following approaches can the data engineer use to set up the new task?
A. They can clone the existing task in the existing Job and update it to run the new notebook.
B. They can create a new task in the existing Job and then add it as a dependency of the original task.
C. They can create a new task in the existing Job and then add the original task as a dependency of the new task.
D. They can create a new job from scratch and add both tasks to run concurrently.
E. They can clone the existing task to a new Job and then edit it to run the new notebook.