
Explanation:
The optimal solution is to transfer your data to Cloud Storage and execute your jobs on Dataproc. Cloud Storage offers scalable and durable object storage, making your data readily accessible for Spark jobs on Dataproc. Dataproc, Google Cloud's managed Spark and Hadoop service, facilitates running these jobs without the overhead of infrastructure management, aligning with the goal of minimal code changes and efficient migration.
Ultimate access to all questions.
You are planning to migrate numerous Apache Spark jobs from an on-premises Apache Hadoop cluster to Google Cloud, aiming to use managed services for job execution to avoid managing a persistent Hadoop cluster. With a tight deadline and the objective to keep code changes to a minimum, what is the most efficient approach to achieve this migration?
A
Migrate your data to BigQuery and transform your Spark scripts into SQL-based processing.
B
Re-engineer your jobs using Apache Beam and execute them on Dataflow.
C
Transfer your data to Compute Engine disks and directly manage and execute your jobs on these instances.
D
Transfer your data to Cloud Storage and execute your jobs on Dataproc.
No comments yet.