
Google Professional Data Engineer
You are designing a data pipeline that publishes application events to a Google Cloud Pub/Sub topic. Message ordering is not important, but the events must be aggregated over distinct hourly intervals before they are loaded into BigQuery for analysis. The pipeline must scale to handle potentially large volumes of events. Which technology should you use to process the events and load the aggregated data into BigQuery?
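To make the aggregation requirement concrete, here is a minimal sketch of fixed (non-overlapping) hourly windows in plain Python. It assumes each event is a `(timestamp, event_type)` pair; that shape and the names `hour_window_start` and `aggregate_hourly` are hypothetical and not part of the question. The sketch only illustrates the windowing logic and does not model Pub/Sub, BigQuery, or any managed processing service.

```python
from collections import Counter
from datetime import datetime, timezone


def hour_window_start(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its one-hour window."""
    return ts.replace(minute=0, second=0, microsecond=0)


def aggregate_hourly(events):
    """Count events per (window start, event type).

    The result does not depend on arrival order, which matches the
    requirement that message ordering is not a priority.
    """
    counts = Counter()
    for ts, event_type in events:
        counts[(hour_window_start(ts), event_type)] += 1
    return counts


# Hypothetical sample events, deliberately out of order.
events = [
    (datetime(2024, 1, 1, 10, 59, tzinfo=timezone.utc), "click"),
    (datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc), "click"),
    (datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc), "click"),
    (datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc), "view"),
]

hourly_counts = aggregate_hourly(events)
```

Each key in `hourly_counts` corresponds to one row that would be written downstream, for example a window start, an event type, and a count. A single-process loop like this would not scale to large event volumes, which is why the question asks for a processing technology that does.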