LeetQuiz Logo
Privacy Policy•contact@leetquiz.com
© 2025 LeetQuiz All rights reserved.
Google Professional Data Engineer

Google Professional Data Engineer

Get started today

Ultimate access to all questions.


You're assisting a team of data analysts in setting up a Cloud Dataproc cluster for Spark-based analysis of large datasets. The cluster will handle numerous jobs. Adhering to Google's recommended best practices, which two actions would you prioritize?

Real Exam




Explanation:

Choosing Cloud Storage for persistent storage is advised over HDFS on local disks, ensuring data remains accessible even after cluster shutdown without the need for data migration. Google suggests capping preemptible VMs at 30% for secondary nodes to balance cost and reliability. Autoscaling is beneficial, especially with Cloud Storage, for clusters managing multiple jobs. For more insights, visit Google's Dataproc Best Practices Guide.

Powered ByGPT-5