
Answer-first summary for fast verification
Answer: Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
The correct answer is D. Creating a Cloud Dataproc cluster that uses the Cloud Storage connector is the best option for migrating the 30-node Apache Hadoop cluster. Dataproc is a managed Hadoop and Spark service, so it minimizes management overhead while letting you reuse existing Hadoop jobs with little or no modification. The Cloud Storage connector (preinstalled on Dataproc) lets those jobs read and write `gs://` paths in place of `hdfs://` paths, so data persists in Cloud Storage independently of the cluster's lifespan. This decoupling of storage from compute is more durable and cost-effective than keeping data in HDFS on persistent disks (option B) or on local SSDs (option E), and it avoids the self-management burden of running Hadoop directly on Compute Engine (options C and E). Cloud Dataflow (option A) would require rewriting the jobs rather than reusing them.
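As a sketch of what option D looks like in practice (the cluster name, region, and `gs://` bucket paths below are placeholders, not values from the question):

```shell
# Create a managed Dataproc cluster sized like the on-prem one.
# The Cloud Storage connector is preinstalled, so no extra setup is needed.
gcloud dataproc clusters create migrated-hadoop \
    --region=us-central1 \
    --num-workers=30

# Submit an existing Hadoop job unchanged, swapping hdfs:// paths
# for gs:// paths so the data outlives the cluster.
gcloud dataproc jobs submit hadoop \
    --cluster=migrated-hadoop \
    --region=us-central1 \
    --jar=gs://my-bucket/jobs/wordcount.jar \
    -- gs://my-bucket/input/ gs://my-bucket/output/

# Delete the cluster when idle; results remain in Cloud Storage.
gcloud dataproc clusters delete migrated-hadoop --region=us-central1
```

Because storage is decoupled from compute, the cluster can be deleted between job runs to save cost without losing any data.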
Author: LeetQuiz Editorial Team
Your company is currently in the process of migrating their 30-node Apache Hadoop cluster to a cloud-based environment. The primary objective is to reuse the Hadoop jobs that have already been developed while minimizing the management overhead associated with maintaining the cluster. Additionally, there is a need to ensure that data can be persisted beyond the lifespan of the cluster itself. What is the best course of action to achieve these goals?
A. Create a Google Cloud Dataflow job to process the data.
B. Create a Google Cloud Dataproc cluster that uses persistent disks for HDFS.
C. Create a Hadoop cluster on Google Compute Engine that uses persistent disks.
D. Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
E. Create a Hadoop cluster on Google Compute Engine that uses Local SSD disks.