Google Professional Data Engineer

Get started today

Ultimate access to all questions.

You are currently managing an on-premises Apache Hadoop deployment and are considering migrating it to the cloud. Your primary objectives are to achieve high fault tolerance and cost efficiency, particularly for long-running batch jobs. Furthermore, you prefer to utilize a managed service to simplify the migration and operational processes. What steps should you take to accomplish this?

Exam-Like

Deploy a Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://

61.0%

Comments

Loading comments...

Deploy a Dataproc cluster. Use an SSD persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://

Install Hadoop and Spark on a 10-node Compute Engine instance group with standard instances. Install the Cloud Storage connector, and store the data in Cloud Storage. Change references in scripts from hdfs:// to gs://

9.8%

Install Hadoop and Spark on a 10-node Compute Engine instance group with preemptible instances. Store data in HDFS. Change references in scripts from hdfs:// to gs://

7.3%