
Answer-first summary for fast verification
Answer: Create a Standard (1 master, 3 workers) Dataproc cluster, and run a Vertex AI Workbench notebook instance on it.
The correct answer is C. A Standard Dataproc cluster is the most suitable option for running Apache Spark workloads on Google Cloud because Dataproc is a managed Spark and Hadoop service: Java, Scala, and Spark come preinstalled, which removes the manual dependency setup required by options A and B. It also minimizes cost and effort for a single proof of concept compared with administering a dedicated VM or a Kubernetes cluster, and the Vertex AI Workbench integration lets the team keep working in a familiar notebook environment, making the migration of the PySpark job smoother. Option D alone provides a notebook but no Spark cluster to execute the job against.
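As a rough sketch of what this first step could look like, the commands below create a Standard Dataproc cluster and submit a PySpark job to it. Cluster name, region, machine types, and the `job.py` file are illustrative placeholders, not values from the question.

```shell
# Create a Standard Dataproc cluster: 1 master, 3 workers.
# Names, region, and machine types are example choices.
gcloud dataproc clusters create pyspark-poc \
    --region=us-central1 \
    --master-machine-type=n2-standard-4 \
    --num-workers=3 \
    --worker-machine-type=n2-standard-4

# Submit an existing PySpark job (job.py is a placeholder)
# to the cluster to validate the migration.
gcloud dataproc jobs submit pyspark job.py \
    --cluster=pyspark-poc \
    --region=us-central1

# Tear the cluster down after the PoC to keep costs minimal.
gcloud dataproc clusters delete pyspark-poc --region=us-central1
```

Deleting the cluster when the PoC finishes keeps the experiment within the "minimal cost" constraint of the question.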
Author: LeetQuiz Editorial Team
You work for a startup that handles various data science workloads using on-premises infrastructure. These workloads are predominantly in PySpark. Your team is planning to transition these data science workloads to Google Cloud. For this, you need to develop a proof of concept (PoC) to migrate one specific data science job to Google Cloud. The PoC should involve minimal cost and effort. What is the first step you should take?
A
Create a n2-standard-4 VM instance and install Java, Scala, and Apache Spark dependencies on it.
B
Create a Google Kubernetes Engine cluster with a basic node pool configuration, install Java, Scala, and Apache Spark dependencies on it.
C
Create a Standard (1 master, 3 workers) Dataproc cluster, and run a Vertex AI Workbench notebook instance on it.
D
Create a Vertex AI Workbench notebook with instance type n2-standard-4.