© 2025 LeetQuiz All rights reserved.
Google Professional Data Engineer



Your analytics team aims to develop a basic statistical model to identify customers who are most likely to re-engage with your company, based on various metrics. They intend to deploy the model on Apache Spark using datasets stored in Google Cloud Storage. You have suggested leveraging Google Cloud Dataproc for executing this task. Preliminary tests indicate that the workload completes in about 30 minutes on a 15-node cluster, with the results being saved to Google BigQuery. The objective is to execute this workload on a weekly basis. How should you optimize the cluster to minimize costs?
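One way to frame the cost question is to count node-hours. The sketch below uses only the figures stated in the question (15 nodes, about 30 minutes, one run per week) and compares a cluster that exists only for the duration of the job with one left running all week. It uses no real pricing and ignores cluster startup time and any billing minimums; the function names are hypothetical and chosen for illustration.

```python
# Figures taken from the question; everything else here is illustrative.
NODES = 15            # cluster size used in the preliminary test
RUN_HOURS = 0.5       # the workload takes about 30 minutes
RUNS_PER_WEEK = 1     # the job is scheduled weekly
HOURS_PER_WEEK = 7 * 24


def node_hours_per_job_cluster(nodes: int, run_hours: float, runs: int) -> float:
    """Node-hours consumed if the cluster exists only while the job runs."""
    return nodes * run_hours * runs


def node_hours_always_on(nodes: int) -> float:
    """Node-hours consumed if the same cluster is left running all week."""
    return nodes * HOURS_PER_WEEK


per_job = node_hours_per_job_cluster(NODES, RUN_HOURS, RUNS_PER_WEEK)
always_on = node_hours_always_on(NODES)

print(f"per-job cluster: {per_job} node-hours/week")
print(f"always-on cluster: {always_on} node-hours/week")
print(f"ratio: {per_job / always_on:.4f}")
```

Under these assumptions the job itself needs 7.5 node-hours per week, against 2520 node-hours for a cluster that is never shut down, so any option that keeps nodes running between runs pays mostly for idle time.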
