A data engineer has configured two nightly jobs. The first job begins at 12:00 AM and typically finishes in 30 minutes. The second job, which depends on the first, starts at 12:45 AM. Occasionally, the second job fails if the first hasn't completed by 12:45 AM. What approach can the data engineer adopt to prevent this issue?
Explanation:
A Databricks job can consist of a single task or a multi-task workflow with dependencies between tasks. Databricks handles task orchestration, cluster management, and error reporting. Tasks can take several forms (for example notebooks, Python scripts, or JARs), and declaring dependencies between them controls whether they run sequentially or in parallel. In this scenario, the second job starts on a fixed schedule whether or not the first has finished, which is why it occasionally fails. The fix is to combine both into a single job with two tasks in a linear dependency, so that the second task starts only after the first completes.
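A linear dependency of this kind could be expressed roughly as follows. This is a minimal sketch that only builds a job definition in the shape used by the Databricks Jobs API 2.1 (`tasks`, `task_key`, `depends_on`, `notebook_task`, `schedule`); it does not call the API. The job name, task keys, and notebook paths are hypothetical, and cluster settings are omitted for brevity.

```python
import json


def build_nightly_job(first_notebook: str, second_notebook: str) -> dict:
    """Build a job definition with two tasks in a linear dependency."""
    return {
        "name": "nightly_pipeline",  # hypothetical job name
        "schedule": {
            # Quartz cron: second, minute, hour, day-of-month, month, day-of-week.
            # This fires once a day at 12:00 AM in the given time zone.
            "quartz_cron_expression": "0 0 0 * * ?",
            "timezone_id": "UTC",  # assumption: adjust to the workspace's zone
        },
        "tasks": [
            {
                # Formerly the first job; it has no upstream dependency.
                "task_key": "first_task",  # hypothetical task key
                "notebook_task": {"notebook_path": first_notebook},
            },
            {
                # Formerly the second job; it waits for first_task to finish
                # instead of starting at a fixed 12:45 AM.
                "task_key": "second_task",  # hypothetical task key
                "depends_on": [{"task_key": "first_task"}],
                "notebook_task": {"notebook_path": second_notebook},
            },
        ],
    }


# Hypothetical notebook paths, for illustration only.
payload = build_nightly_job("/Jobs/first", "/Jobs/second")
print(json.dumps(payload, indent=2))
```

With this layout there is only one schedule, at 12:00 AM. The second task has no start time of its own, so a slow first task delays it rather than causing it to fail.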