
Answer-first summary for fast verification
Answer: Create a Directed Acyclic Graph in Cloud Composer
The best option for automating scheduled Spark jobs on Cloud Dataproc, where some jobs must run in sequence and others can run concurrently, is to create a Directed Acyclic Graph (DAG) in Cloud Composer. Cloud Composer is a managed Apache Airflow service built for orchestrating workflows with dependencies: you declare which tasks depend on which, so a job starts only after its upstream jobs finish, while jobs with no dependency on each other can run in parallel. Airflow provides operators for submitting jobs to Cloud Dataproc, and a DAG can run on a time-based schedule or be gated on data availability (for example, with sensors). The same DAG model extends to larger pipelines as more jobs are added.
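To make the sequential-versus-concurrent idea concrete, here is a minimal sketch of how a DAG's dependency declarations determine run order. It uses only Python's standard library, not Airflow or Cloud Composer, and the job names (`ingest`, `clean`, and so on) are hypothetical. In a real Composer DAG, each job would be an Airflow task and the Airflow scheduler would do this ordering for you.

```python
from graphlib import TopologicalSorter

# Hypothetical Spark job names; each key maps to the set of jobs it must wait for.
dependencies = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate_sales": {"clean"},
    "aggregate_users": {"clean"},
    "report": {"aggregate_sales", "aggregate_users"},
}


def execution_waves(deps):
    """Group jobs into waves; jobs in the same wave have no unmet
    dependencies and could run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished, unlocking the next
    return waves


for number, wave in enumerate(execution_waves(dependencies), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Here `aggregate_sales` and `aggregate_users` land in the same wave because both depend only on `clean`, while `report` waits for both of them. This is the kind of ordering a DAG in Cloud Composer expresses declaratively.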
Author: LeetQuiz Editorial Team
You are managing a series of Spark jobs that execute on a Cloud Dataproc cluster based on a defined schedule. These jobs have dependencies; some must execute in a specific sequence, while others can run concurrently. Your task is to automate the execution and orchestration of these jobs. How should you proceed to achieve this automation effectively?
A. Create a Cloud Dataproc Workflow Template
B. Create an initialization action to execute the jobs
C. Create a Directed Acyclic Graph in Cloud Composer
D. Create a Bash script that uses the Cloud SDK to create a cluster, execute jobs, and then tear down the cluster