
Answer-first summary for fast verification
Answer: Partition table data by create_date, and cluster table data by location_id and device_version.
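To maximize performance and cost efficiency, partition the table by create_date and cluster it by location_id and device_version. Partitioning by create_date lets BigQuery prune partitions outside the requested date range, so queries for recent data read only the newest partitions. Clustering by location_id and device_version sorts the data within each partition by those columns, so filters on them can skip blocks that cannot match. Together these reduce the bytes scanned, which lowers both query cost and latency. The partition-by-several-columns options (A and D) do not work because a BigQuery table can be partitioned on only one column, and clustering alone (C) gives no partition-level pruning for the date filter.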
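The layout described above can be sketched as BigQuery DDL plus a typical query. This is a minimal illustration, not a definitive schema: the table name `iot_dataset.sensor_readings`, the `reading` column, the column types, and the 7-day window are all assumptions made for the example. Only create_date, location_id, and device_version come from the question. The Python helpers just build SQL strings, so the block runs without any BigQuery client library.

```python
# Hypothetical table name; replace with your own dataset and table.
DEFAULT_TABLE = "iot_dataset.sensor_readings"


def build_sensor_table_ddl(table: str = DEFAULT_TABLE) -> str:
    """Return DDL for a table partitioned by date and clustered by two columns."""
    # Column types and the `reading` column are assumptions for illustration.
    columns = [
        ("create_date", "DATE"),
        ("location_id", "STRING"),
        ("device_version", "STRING"),
        ("reading", "FLOAT64"),
    ]
    column_sql = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE `{table}` (\n  {column_sql}\n)\n"
        "PARTITION BY create_date\n"
        "CLUSTER BY location_id, device_version"
    )


def build_recent_query(table: str = DEFAULT_TABLE, days: int = 7) -> str:
    """Return a query that filters on the partition column and both cluster columns."""
    return (
        "SELECT location_id, device_version, reading\n"
        f"FROM `{table}`\n"
        # The date filter targets the partitioning column.
        f"WHERE create_date >= DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)\n"
        # These filters target the clustering columns, in clustering order.
        "  AND location_id = @location_id\n"
        "  AND device_version = @device_version"
    )


if __name__ == "__main__":
    print(build_sensor_table_ddl())
    print()
    print(build_recent_query())
```

The query uses named parameters (`@location_id`, `@device_version`) so the same text can be reused for different devices. Filtering on create_date first and then on the clustering columns in their declared order matches how the table is laid out.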
Author: LeetQuiz Editorial Team
As a data engineer tasked with managing and analyzing vast quantities of IoT sensor data collected from millions of devices worldwide, you are responsible for storing this data in BigQuery. The typical queries you will be running are primarily focused on accessing the most recent data, and these queries are filtered by location_id and device_version. In order to optimize these queries for both cost efficiency and performance, how should you structure your data in BigQuery?
A
Partition table data by create_date, location_id, and device_version.
B
Partition table data by create_date, and cluster table data by location_id and device_version.
C
Cluster table data by create_date, location_id, and device_version.
D
Cluster table data by create_date, and partition by location_id and device_version.