
Answer-first summary for fast verification
Answer: Create a multi-task job with separate tasks for each step, configure task dependencies to enforce the sequence, and use shared storage for data exchange between tasks.
Designing the pipeline as a multi-task job keeps each stage modular, independently retryable, and easier to monitor. Task dependencies guarantee that extraction finishes before transformation begins and transformation before loading, while shared storage (for example, a Delta table or a cloud object store path) gives each downstream task a reliable hand-off point for intermediate data.
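For concreteness, here is a minimal sketch of how such a job could be defined through the Databricks Jobs API 2.1 (`POST /api/2.1/jobs/create`). The workspace URL, token, notebook paths, and cluster spec are illustrative placeholders, not values taken from the question.

```python
# Sketch of a three-task ETL job with enforced dependencies, defined via the
# Databricks Jobs API 2.1. Placeholders: <workspace-url>, <token>, the
# /Pipelines/* notebook paths, and the cluster configuration.
import requests

job_spec = {
    "name": "etl_pipeline",
    "tasks": [
        {
            "task_key": "extract",
            "notebook_task": {"notebook_path": "/Pipelines/extract"},
            "job_cluster_key": "etl_cluster",
        },
        {
            "task_key": "transform",
            # depends_on enforces the sequence: transform runs only
            # after extract succeeds.
            "depends_on": [{"task_key": "extract"}],
            "notebook_task": {"notebook_path": "/Pipelines/transform"},
            "job_cluster_key": "etl_cluster",
        },
        {
            "task_key": "load",
            "depends_on": [{"task_key": "transform"}],
            "notebook_task": {"notebook_path": "/Pipelines/load"},
            "job_cluster_key": "etl_cluster",
        },
    ],
    # One shared job cluster reused by all three tasks.
    "job_clusters": [
        {
            "job_cluster_key": "etl_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
}

resp = requests.post(
    "https://<workspace-url>/api/2.1/jobs/create",
    headers={"Authorization": "Bearer <token>"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json())  # {"job_id": ...} on success
```

The hand-off between tasks happens through shared storage rather than in-memory state: each notebook persists its output where the next task can read it. A minimal sketch, assuming Delta tables on a mounted path (the paths below are hypothetical, and `spark` is the session Databricks predefines in notebooks):

```python
# In the "extract" notebook: persist raw data to a shared Delta location.
raw_df = spark.read.json("s3://landing-zone/orders/")  # source path is an assumption
raw_df.write.format("delta").mode("overwrite").save("/mnt/etl/raw_orders")

# In the "transform" notebook: pick up exactly what extract wrote.
raw_df = spark.read.format("delta").load("/mnt/etl/raw_orders")
```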
Author: LeetQuiz Editorial Team
You are tasked with creating a Databricks job that involves data extraction from multiple sources, transformation, and loading into a data warehouse. Each step depends on the preceding steps. Describe how you would design this job to ensure efficient and reliable execution.
A. Create a single notebook that includes all steps, execute the notebook as a job, and handle dependencies within the notebook code.
B. Create a multi-task job with separate tasks for each step, configure task dependencies to enforce the sequence, and use shared storage for data exchange between tasks.
C. Schedule each step as a separate job and manually trigger each job when the previous one completes.
D. Combine all steps into a single Python script, upload the script to Databricks, and run it as a job.