
Answer-first summary for fast verification
Answer: Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
The correct answer is D. Creating a Cloud Dataproc cluster that uses the Cloud Storage connector is the best option for migrating the 30-node Apache Hadoop cluster. Dataproc is a managed Hadoop and Spark service, so it minimizes management overhead while letting you reuse existing Hadoop jobs with little or no modification. The Cloud Storage connector (preinstalled on Dataproc) lets those jobs read and write `gs://` paths in place of `hdfs://` paths, so data persists in Cloud Storage independently of the cluster's lifespan. This decoupling of storage from compute is more durable and cost-effective than keeping data in HDFS on persistent disks (option B) or on local SSDs (option E), and it avoids the self-management burden of running Hadoop directly on Compute Engine (options C and E). Cloud Dataflow (option A) would require rewriting the jobs rather than reusing them.
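As a sketch of what option D looks like in practice (the cluster name, region, and `gs://` bucket paths below are placeholders, not values from the question):

```shell
# Create a managed Dataproc cluster sized like the on-prem one.
# The Cloud Storage connector is preinstalled, so no extra setup is needed.
gcloud dataproc clusters create migrated-hadoop \
    --region=us-central1 \
    --num-workers=30

# Submit an existing Hadoop job unchanged, swapping hdfs:// paths
# for gs:// paths so the data outlives the cluster.
gcloud dataproc jobs submit hadoop \
    --cluster=migrated-hadoop \
    --region=us-central1 \
    --jar=gs://my-bucket/jobs/wordcount.jar \
    -- gs://my-bucket/input/ gs://my-bucket/output/

# Delete the cluster when idle; results remain in Cloud Storage.
gcloud dataproc clusters delete migrated-hadoop --region=us-central1
```

Because storage is decoupled from compute, the cluster can be deleted between job runs to save cost without losing any data.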
Author: LeetQuiz Editorial Team
Your company is currently in the process of migrating their 30-node Apache Hadoop cluster to a cloud-based environment. The primary objective is to reuse the Hadoop jobs that have already been developed while minimizing the management overhead associated with maintaining the cluster. Additionally, there is a need to ensure that data can be persisted beyond the lifespan of the cluster itself. What is the best course of action to achieve these goals?
A. Create a Google Cloud Dataflow job to process the data.
B. Create a Google Cloud Dataproc cluster that uses persistent disks for HDFS.
C. Create a Hadoop cluster on Google Compute Engine that uses persistent disks.
D. Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
E. Create a Hadoop cluster on Google Compute Engine that uses Local SSD disks.