
In the context of designing a multi-task Databricks job on Microsoft Azure, where each task must start only after its predecessor completes successfully in order to preserve data integrity and workflow correctness, which of the following approaches best aligns with Azure best practices for managing task dependencies and execution order? Consider factors such as cost-efficiency, scalability, and compliance with Azure's recommended patterns. Choose the best option from the following:
A
Implement a single monolithic task encompassing all operations, thereby eliminating the need for dependency management but potentially increasing resource utilization and reducing scalability.
B
Define multiple tasks within the job, explicitly setting each task to depend on the successful completion of the previous one, so the job executes the tasks sequentially, leveraging Databricks' native task dependency features for optimal resource use and scalability.
C
Deploy all tasks to run in parallel, utilizing an external coordination service like Apache ZooKeeper to manage dependencies, which may introduce additional complexity and cost without significant benefits for sequential workflows.
D
Allow each task to run independently, using a message queue to pass results between tasks, which could lead to out-of-order execution and compromise data integrity despite being cost-effective.
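The correct answer is B. As a sketch of what that looks like in practice, the snippet below builds a multi-task job specification in the shape used by the Databricks Jobs API 2.1, where each task names its predecessor in `depends_on` and the scheduler starts it only after that predecessor succeeds. The task names and notebook paths (`ingest`, `transform`, `publish`, `/ETL/...`) are hypothetical placeholders, not values from the question.

```python
# Hedged sketch of a sequential multi-task job spec (Databricks Jobs API 2.1
# task format). Task names and notebook paths are illustrative assumptions.
job_spec = {
    "name": "nightly-etl",
    "tasks": [
        # First task has no dependencies; it runs as soon as the job starts.
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/ETL/ingest"},
        },
        # Each later task lists its predecessor in `depends_on`, so the
        # scheduler only launches it after that task completes successfully.
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],
            "notebook_task": {"notebook_path": "/ETL/transform"},
        },
        {
            "task_key": "publish",
            "depends_on": [{"task_key": "transform"}],
            "notebook_task": {"notebook_path": "/ETL/publish"},
        },
    ],
}
```

This dictionary could be submitted via the Jobs API or a Databricks SDK client; the key point for the question is that the dependency chain, not a monolithic task or an external coordinator, is what enforces the execution order.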