
Answer-first summary for fast verification
Answer: Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.
The correct answer is B. Hopping windows are sliding windows, meaning they can overlap, which makes them suitable for aggregating data over the last 30 seconds every 5 seconds. This provides more frequent updates and a more continuous view of the data. Memorystore is chosen because it provides low-latency access, which is ideal for real-time visualization and analysis. While BigQuery is great for large-scale data analysis, it does not provide the low latency required for this real-time use case.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
You are designing a real-time system for a ride-hailing app, aiming to identify areas with high demand for rides to efficiently reroute available drivers. This system ingests data from multiple sources into Pub/Sub, processes it, and stores the resulting information for real-time dashboards used for visualization and analysis. The data sources include driver location updates every 5 seconds and app-based booking events from riders. The key data processing challenge involves the real-time aggregation of both supply and demand data over the last 30 seconds, updating every 2 seconds. You need to store these aggregated results in a low-latency system for effective visualization. How should you process and store the aggregated data?
A
Group the data by using a tumbling window in a Dataflow pipeline, and write the aggregated data to Memorystore.
B
Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.
C
Group the data by using a session window in a Dataflow pipeline, and write the aggregated data to BigQuery.
D
Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to BigQuery.
No comments yet.