
Answer-first summary for fast verification
Answer: Migrate the workloads to Dataproc plus Cloud Storage; modernize later.
Option B is the correct answer. Dataproc is a managed service for running Apache Spark and Hadoop jobs (Hive included), so the existing workloads can move with few code changes, which keeps the migration within the 2-month window. Storing data in Cloud Storage instead of HDFS decouples storage from compute: clusters no longer have to stay up just to hold data, so they can be sized per job, scaled down, or deleted between batch runs, which suits the highly variable usage and lowers cost. Option A keeps HDFS on the cluster's disks and gives up those savings, while options C and D require rewriting workloads for BigQuery or Dataflow before the deadline. Modernizing to serverless services such as Dataflow and BigQuery can follow later without time pressure.
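To illustrate why the move from HDFS to Cloud Storage is usually a small change for existing jobs, here is a minimal sketch of rewriting input and output paths from the hdfs:// scheme to gs://. The helper name hdfs_to_gcs and the bucket name example-bucket are hypothetical and used only for illustration; a real migration also has to copy the data and check job-specific settings.

```python
from urllib.parse import urlparse


def hdfs_to_gcs(uri: str, bucket: str) -> str:
    """Map an hdfs:// URI to a gs:// URI in `bucket`, keeping the path.

    Hypothetical helper for illustration; URIs with any other scheme
    (or plain paths) are returned unchanged.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "hdfs":
        return uri
    # Drop the NameNode host:port and keep only the path portion.
    return f"gs://{bucket}{parsed.path}"


# "example-bucket" is an assumed bucket name, not taken from the question.
print(hdfs_to_gcs("hdfs://namenode:8020/warehouse/sales", "example-bucket"))
```

On Dataproc, Spark and Hive jobs can read and write gs:// paths through the Cloud Storage connector, so in many cases the path change is the main edit a job needs.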
Author: LeetQuiz Editorial Team
Your company operates a sizable on-premises cluster utilizing Spark, Hive, and HDFS within a colocation facility. This cluster is built to handle peak system usage, although many tasks are processed in batch mode, leading to significant fluctuations in cluster usage. The company is interested in transitioning to the cloud to minimize the overhead tied to maintaining on-premises infrastructure and to achieve cost savings. Additionally, there's an interest in modernizing the infrastructure with more serverless solutions to leverage cloud advantages. Considering the timing for contract renewal with the colocation facility, the company has a strict timeline of 2 months for the initial migration. What strategy would you recommend for this migration to both maximize cost efficiency in the cloud and complete the transition within the allocated timeframe?
A. Migrate the workloads to Dataproc plus HDFS; modernize later.
B. Migrate the workloads to Dataproc plus Cloud Storage; modernize later.
C. Migrate the Spark workload to Dataproc plus HDFS, and modernize the Hive workload for BigQuery.
D. Modernize the Spark workload for Dataflow and the Hive workload for BigQuery.