
Answer-first summary for fast verification
Answer: Create a narrow table in Bigtable with a row key that combines the Compute Engine computer identifier with the sample time at each second
The correct answer is C. For time-series data at this scale, a narrow Bigtable table whose row key combines the computer identifier with the per-second sample time is the best fit. Storing one sample per row keeps queries simple and avoids the risk of exceeding the recommended maximum row size as the dataset grows. Bigtable provides low-latency access to very large datasets, which real-time analytics requires, and it is billed for provisioned capacity and storage rather than per query, unlike BigQuery's on-demand model (options A and B).
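To make the row key described above concrete, here is a minimal sketch in plain Python. It only builds and sorts keys; it does not call the Bigtable client. The `vm-0001` identifier format, the `#` delimiter, and the zero-padded epoch-seconds suffix are illustrative assumptions, not part of the question. Because Bigtable stores rows in lexicographic order by row key, keys shaped this way keep each computer's samples contiguous and in time order.

```python
from datetime import datetime, timezone


def make_row_key(machine_id: str, sample_time: datetime) -> bytes:
    """Build a narrow-table row key: <machine_id>#<UTC epoch seconds>.

    Assumptions (hypothetical, for illustration): '#' as the delimiter and a
    10-digit zero-padded epoch so that byte order matches time order.
    sample_time is expected to be timezone-aware.
    """
    epoch_seconds = int(sample_time.astimezone(timezone.utc).timestamp())
    return f"{machine_id}#{epoch_seconds:010d}".encode("utf-8")


def scan_range(machine_id: str, start: datetime, end: datetime) -> tuple[bytes, bytes]:
    """Return (start_key, end_key) bounding one computer's samples in [start, end)."""
    return make_row_key(machine_id, start), make_row_key(machine_id, end)


t0 = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
t1 = datetime(2024, 1, 1, 0, 0, 1, tzinfo=timezone.utc)

# One row per computer per second; sorting mimics Bigtable's key ordering.
keys = [
    make_row_key("vm-0002", t0),
    make_row_key("vm-0001", t1),
    make_row_key("vm-0001", t0),
]
sorted_keys = sorted(keys)
start_key, end_key = scan_range("vm-0001", t0, t1)
```

With this layout, reading one computer's history for a time window is a single contiguous range scan between `start_key` and `end_key`. Putting the identifier before the timestamp also spreads concurrent writes across computers instead of concentrating them on the most recent timestamp.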
Author: LeetQuiz Editorial Team
As a data engineer, you are tasked with selecting an appropriate database solution for storing time series data related to CPU and memory usage across millions of computers. This data is collected in one-second interval samples. The database must support real-time, ad hoc analytics performed by analysts. Additionally, you need to ensure that the database and its schema design are cost-efficient, avoiding charges per query executed, and scalable for future dataset growth. Which database and data model would you choose to meet these requirements?
A
Create a table in BigQuery, and append the new samples for CPU and memory to the table
B
Create a wide table in BigQuery, create a column for the sample value at each second, and update the row with the interval for each second
C
Create a narrow table in Bigtable with a row key that combines the Compute Engine computer identifier with the sample time at each second
D
Create a wide table in Bigtable with a row key that combines the computer identifier with the sample time at each minute, and combine the values for each second as column data