
Answer-first summary for fast verification
Answer: Use pre-emptible virtual machines (VMs) for the cluster
The correct answer is B: Use pre-emptible virtual machines (VMs) for the cluster. Pre-emptible VMs cost significantly less than regular VMs, typically 60-91% less. The workload described is a good fit for them: it is a short, scheduled weekly batch job, and Spark on Dataproc tolerates the loss of pre-emptible worker nodes by re-running failed tasks. Option A (migrating to Cloud Dataflow) would require rewriting the Spark job; options C and D (higher-memory nodes, SSDs on workers) may shorten the run but raise the per-hour price, so neither reliably lowers the total cost.
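To make the savings concrete, here is a minimal back-of-the-envelope calculation for the workload in the question (15 nodes, roughly 30 minutes, once a week). The hourly rate and the exact discount are illustrative assumptions, not current Google Cloud list prices:

```python
# Hypothetical cost comparison for the weekly Dataproc job described above.
# ON_DEMAND_HOURLY and PREEMPTIBLE_DISCOUNT are assumed values for illustration.
ON_DEMAND_HOURLY = 0.19      # assumed per-VM hourly rate
PREEMPTIBLE_DISCOUNT = 0.80  # assumed discount within the 60-91% range

nodes = 15
hours_per_run = 0.5          # the job finishes in about 30 minutes
runs_per_year = 52           # weekly schedule

on_demand = nodes * hours_per_run * ON_DEMAND_HOURLY * runs_per_year
preemptible = on_demand * (1 - PREEMPTIBLE_DISCOUNT)

print(f"on-demand:   ${on_demand:.2f}/year")    # $74.10/year
print(f"preemptible: ${preemptible:.2f}/year")  # $14.82/year
```

Even under these assumed prices, the compute bill drops by the chosen discount factor with no change to the job itself, which is why option B is the cost-optimization answer.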
Author: LeetQuiz Editorial Team
Your analytics team aims to develop a basic statistical model to identify customers who are most likely to re-engage with your company, based on various metrics. They intend to deploy the model on Apache Spark using datasets stored in Google Cloud Storage. You have suggested leveraging Google Cloud Dataproc for executing this task. Preliminary tests indicate that the workload completes in about 30 minutes on a 15-node cluster, with the results being saved to Google BigQuery. The objective is to execute this workload on a weekly basis. How should you optimize the cluster to minimize costs?
A. Migrate the workload to Google Cloud Dataflow
B. Use pre-emptible virtual machines (VMs) for the cluster
C. Use a higher-memory node so that the job runs faster
D. Use SSDs on the worker nodes so that the job can run faster