
Answer-first summary for fast verification
Answer: Use ephemeral clusters.
A. Incorrect. Because many jobs are high priority, running them in sequence would not meet the business requirements. B. Incorrect. With Dataproc, this is not a recommended approach. Configurations and running jobs could interfere with each other. C. Correct. Jobs can use ephemeral clusters to quickly run the job and then deallocate the resources after use. Multiple jobs can be run in parallel without interfering with each other. D. Incorrect. Cluster autoscaling is effective within a cluster running individual jobs, but it is not recommended when running multiple jobs because they can interfere with the resource scaling. Links: https://cloud.google.com/blog/products/data-analytics/dataproc-job-optimization-how-t o-guide More information: Courses: Building Batch Data Pipelines on Google Cloud ● Executing Spark on Dataproc
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
You need to design a Dataproc cluster to run multiple small jobs. Many jobs (but not all) are of high priority.
A
Reuse the same cluster and run each job in sequence.
B
Reuse the same cluster to run all jobs in parallel.
C
Use ephemeral clusters.
D
Use cluster autoscaling.