
Answer-first summary for fast verification
Answer: Create a streaming Dataflow job to continually read from the Pub/Sub topic and perform the necessary aggregations using tumbling windows
**Explanation:** Option A is the correct answer because:

- **Streaming Dataflow** is designed for real-time data processing at scale
- **Tumbling windows** provide exactly hourly aggregations with clear boundaries
- **Automatic scaling** handles large volumes of events efficiently
- **Continuous processing** ensures timely aggregation without delays
- **Native integration** with Pub/Sub and BigQuery

The other options have limitations:

- Option B: Batch processing introduces latency and may miss real-time events
- Option C: Cloud Functions have execution time limits and are not designed for large-scale data processing
- Option D: A Cloud Function triggered per message would be inefficient for aggregation and could hit rate limits
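To make the tumbling-window idea concrete, here is a minimal pure-Python sketch of hourly tumbling-window aggregation: each event timestamp is floored to the start of its one-hour window, and events are counted per window and type. This only illustrates the windowing concept; a real pipeline would use Dataflow/Apache Beam (e.g. `beam.WindowInto(beam.window.FixedWindows(3600))` on a Pub/Sub source). The event data below is hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # one-hour tumbling windows

def window_start(ts: float) -> float:
    """Floor an event timestamp (epoch seconds) to the start of its hourly window."""
    return ts - (ts % WINDOW_SECONDS)

def aggregate(events):
    """Count events per (hourly window start, event type)."""
    counts = defaultdict(int)
    for ts, event_type in events:
        counts[(window_start(ts), event_type)] += 1
    return dict(counts)

# Hypothetical event stream: (epoch seconds, event type)
events = [
    (1700000100.0, "click"),
    (1700001500.0, "click"),
    (1700003700.0, "view"),  # falls into the next hourly window
]
print(aggregate(events))
```

Because tumbling (fixed) windows never overlap, every event lands in exactly one window, which is what guarantees the clean hourly boundaries the explanation refers to.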
Author: LeetQuiz.
NO.12 You are designing a pipeline that publishes application events to a Pub/Sub topic. You need to aggregate events across hourly intervals before loading the results to BigQuery for analysis. Your solution must be scalable so it can process and load large volumes of events to BigQuery. What should you do?
A
Create a streaming Dataflow job to continually read from the Pub/Sub topic and perform the necessary aggregations using tumbling windows
B
Schedule a batch Dataflow job to run hourly, pulling all available messages from the Pub/Sub topic and performing the necessary aggregations
C
Schedule a Cloud Function to run hourly, pulling all available messages from the Pub/Sub topic and performing the necessary aggregations
D
Create a Cloud Function that performs the necessary data processing, executed via a Pub/Sub trigger every time a new message is published to the topic