© 2025 LeetQuiz All rights reserved.
Google Professional Machine Learning Engineer


You are designing a data processing solution for a financial services company that must process large-scale transaction data in both real-time and batch modes using Apache Spark. The solution must be cost-effective, scalable, and fully managed to minimize operational overhead. It should also integrate seamlessly with other Google Cloud services for analytics and storage. Which Google Cloud service is the BEST choice for this scenario, and why?

  • A. Cloud Composer
  • B. BigQuery
  • C. Dataproc
  • D. Cloud Dataflow


Explanation:

Correct Option: C. Dataproc
Dataproc is the best choice because it is Google Cloud's fully managed service for running Apache Spark and Hadoop workloads, so existing Spark jobs can typically run on it with little or no change. It supports autoscaling to handle varying workloads, and it helps control cost through per-second billing, preemptible or Spot worker VMs, and short-lived clusters that are deleted once a job finishes. It also integrates with other Google Cloud services such as BigQuery and Cloud Storage for analytics and storage. Because both Spark batch jobs and Spark Structured Streaming jobs can run on Dataproc, it covers the real-time and batch requirements while keeping operational overhead low.
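
To make the "fully managed" point concrete, here is a minimal sketch of how a PySpark job might be submitted to an existing Dataproc cluster through the `gcloud dataproc jobs submit pyspark` command. The helper only assembles the command line and does not call Google Cloud. The function name `build_submit_command` and the bucket, cluster, and table names are hypothetical and used for illustration only.

```python
import shlex


def build_submit_command(script_uri, cluster, region, job_args=()):
    """Return the argv list for submitting a PySpark job to a Dataproc cluster."""
    command = [
        "gcloud", "dataproc", "jobs", "submit", "pyspark", script_uri,
        f"--cluster={cluster}",
        f"--region={region}",
    ]
    if job_args:
        # Everything after "--" is passed through to the PySpark script itself.
        command.append("--")
        command.extend(job_args)
    return command


if __name__ == "__main__":
    # Hypothetical bucket, cluster, and table names (assumptions, not from the question).
    argv = build_submit_command(
        "gs://example-bucket/jobs/aggregate_transactions.py",
        cluster="transactions-cluster",
        region="us-central1",
        job_args=[
            "--input=gs://example-bucket/transactions/",
            "--output=analytics.daily_totals",
        ],
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(argv))
```

The script itself would contain ordinary Spark code. The sketch assumes it reads from Cloud Storage and writes to BigQuery, which matches the integration described above.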

Why other options are not correct:

  • A. Cloud Composer: It is a managed Apache Airflow service for workflow orchestration. It can schedule and trigger Spark jobs, but it does not run Spark data processing itself.
  • B. BigQuery: Although powerful for analytics and SQL querying, it is a serverless data warehouse rather than a general-purpose managed runtime for Spark batch and streaming jobs.
  • D. Cloud Dataflow: It is fully managed and supports both stream and batch processing, but it runs Apache Beam pipelines rather than Apache Spark jobs, so it does not meet the requirement to use Spark.