
Answer-first summary for fast verification
Answer: Dataproc
**Correct Option: D. Dataproc** Dataproc is Google Cloud's fully managed service for Apache Spark and Hadoop, optimized for large-scale batch processing and real-time analytics. It offers: - **Fully managed infrastructure**, eliminating the need for manual setup. - **Scalability**, allowing clusters to adjust based on the dataset size. - **Integration with GCP services**, such as BigQuery and Cloud Storage, for seamless data workflows. - **Apache Spark support**, enabling efficient data processing and machine learning tasks. **Why other options are not correct:** - **A. BigQuery**: A serverless data warehouse focused on querying and analyzing data, not designed for processing pipelines. - **B. Cloud Functions**: A serverless platform for connecting cloud services, not suited for large-scale data processing. - **C. Dataflow**: While it supports data processing pipelines, it's not specialized for large-scale batch processing like Dataproc.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
You are tasked with designing a data processing pipeline for a financial services company that requires processing large-scale datasets with Apache Spark for real-time analytics. The solution must be cost-effective, scalable, and fully managed to minimize operational overhead. Additionally, the company emphasizes the importance of seamless integration with other Google Cloud services for data storage and analytics. Which Google Cloud service should you choose to best meet these requirements? (Choose one correct option)
A
BigQuery
B
Cloud Functions
C
Dataflow
D
Dataproc
No comments yet.