
Answer-first summary for fast verification
Answer: Dataflow
**Correct Option: C. Dataflow**

Dataflow is a fully managed service designed for executing data processing pipelines, making it ideal for ETL tasks. It offers a unified programming model for both batch and stream processing, which simplifies building and managing complex data pipelines.

**Key features of Dataflow for ETL:**

- **Scalability:** Automatically scales to handle large datasets and complex transformations.
- **Fault tolerance:** Guarantees data reliability and durability.
- **Integration with other GCP services:** Works seamlessly with services like BigQuery, Cloud Storage, and Pub/Sub.
- **Unified programming model:** Allows the use of the same model for both batch and stream processing.

**Why other options are incorrect:**

- **A. BigQuery:** A serverless data warehouse intended for querying and analyzing large datasets, not for ETL pipelines.
- **B. Cloud Spanner:** A fully managed relational database service, not a service for running ETL pipelines.
- **D. Cloud Functions:** A serverless computing platform for building and connecting cloud services, not suited for ETL pipelines.
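To make the "unified programming model" point concrete, here is a minimal sketch in plain Python. It is a toy illustration only and does not use the Apache Beam SDK that Dataflow pipelines are written with. The names `transform`, `run_batch`, `run_stream`, and the `(key, value)` record shape are hypothetical, chosen for this example. The idea it shows is that one transform step can serve both a bounded (batch) input and an incrementally consumed (streaming) input.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape for this sketch: (key, value)
Event = Tuple[str, int]


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Shared transform step: drop negative values and normalize keys."""
    for key, value in events:
        if value >= 0:
            yield key.strip().lower(), value


def run_batch(events: Iterable[Event]) -> Dict[str, int]:
    """Batch mode: consume a bounded input and return one final total per key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in transform(events):
        totals[key] += value
    return dict(totals)


def run_stream(events: Iterable[Event]) -> Iterator[Dict[str, int]]:
    """Streaming mode: yield a snapshot of running totals after each accepted event."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in transform(events):
        totals[key] += value
        yield dict(totals)


# Made-up sample data, used only to exercise both modes.
records = [(" Clicks", 3), ("views", 5), ("clicks", -1), ("CLICKS", 2)]
batch_result = run_batch(records)
stream_snapshots = list(run_stream(iter(records)))
```

Both modes reuse the same `transform` function, so the last streaming snapshot matches the batch result. A real Dataflow pipeline would additionally deal with windowing, distributed execution, and autoscaling, none of which this sketch attempts.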
Author: LeetQuiz Editorial Team
In the context of designing a scalable and cost-effective data processing system on Google Cloud Platform (GCP), you are tasked with selecting the most appropriate service for constructing an ETL (Extract, Transform, Load) pipeline that handles both batch and stream processing. The solution must ensure high fault tolerance, seamless integration with other GCP services, and the ability to automatically scale based on the workload. Considering these requirements, which GCP service should you choose? (Choose one correct option)
A. BigQuery
B. Cloud Spanner
C. Dataflow
D. Cloud Functions