
Explanation:
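The correct answer is B. Hopping windows (called sliding windows in Apache Beam) have a fixed length and start at a fixed interval shorter than that length, so consecutive windows overlap. That matches the requirement exactly: a 30-second window that advances every 2 seconds, so each result covers the last 30 seconds and the dashboard refreshes every 2 seconds. Tumbling windows do not overlap, so a 30-second tumbling window would produce a result only once every 30 seconds. Session windows are bounded by gaps in activity rather than by a fixed duration, so they do not fit a fixed-length aggregation. Memorystore is chosen because it provides low-latency access, which is ideal for real-time visualization and analysis. While BigQuery is great for large-scale data analysis, it does not provide the low latency required for this real-time use case.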
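To make the overlap concrete, the following minimal sketch in plain Python lists the hopping windows that a single event falls into, using the question's 30-second window advancing every 2 seconds. The function name hopping_windows and the integer-second timestamps are assumptions made for illustration; this is not Dataflow code. In an actual Beam Python pipeline the equivalent assignment would typically be done with beam.WindowInto(beam.window.SlidingWindows(size=30, period=2)).

```python
def hopping_windows(timestamp, size=30, period=2):
    """Return every [start, end) window of length `size`, starting at
    multiples of `period`, that contains `timestamp` (in seconds).

    Hypothetical helper for illustration only.
    """
    # Latest window start at or before the event.
    start = timestamp - (timestamp % period)
    windows = []
    # Walk backwards while the window still covers the event
    # (the end bound is exclusive, so start must be > timestamp - size).
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= period
    return windows


if __name__ == "__main__":
    # An event at t=31s belongs to size / period = 15 overlapping windows.
    print(len(hopping_windows(31)))                   # 15
    # With period == size the windows tumble: exactly one window per event.
    print(hopping_windows(31, size=30, period=30))    # [(30, 60)]
```

Each event contributes to 15 windows, which is why every 2-second result still reflects a full 30 seconds of supply and demand data.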
You are designing a real-time system for a ride-hailing app, aiming to identify areas with high demand for rides to efficiently reroute available drivers. This system ingests data from multiple sources into Pub/Sub, processes it, and stores the resulting information for real-time dashboards used for visualization and analysis. The data sources include driver location updates every 5 seconds and app-based booking events from riders. The key data processing challenge involves the real-time aggregation of both supply and demand data over the last 30 seconds, updating every 2 seconds. You need to store these aggregated results in a low-latency system for effective visualization. How should you process and store the aggregated data?
A
Group the data by using a tumbling window in a Dataflow pipeline, and write the aggregated data to Memorystore.
B
Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.
C
Group the data by using a session window in a Dataflow pipeline, and write the aggregated data to BigQuery.
D
Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to BigQuery.