Google Professional Data Engineer

NO.29 You want to optimize your queries for cost and performance. How should you structure your data?

A. Partition table data by create_date, location_id and device_version

B. Partition table data by create_date, cluster table data by location_id and device_version

C. Cluster table data by create_date, location_id and device_version

D. Cluster table data by create_date, partition by locationed and device_version

Explanation:

Option B is correct because:

Partition by create_date: Date-based partitioning suits time-series data because a query that filters on create_date scans only the matching partitions
Cluster by location_id and device_version: Clustering sorts the data within each partition by these columns, so filters on them let BigQuery skip blocks that cannot match
Cost optimization: Partitioning reduces the amount of data scanned, which directly lowers on-demand query costs
Performance optimization: Clustering improves query performance by keeping rows that common filters select close together
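
The effect of the two techniques can be illustrated with a toy model. The sketch below is not how BigQuery stores data; it only assumes that partitions are pruned by the partition filter and that, inside a partition, fixed-size blocks are skipped when a filtered cluster column's min/max range excludes the requested value. The sample data, the block size, and the function names are all hypothetical.

```python
import random
from datetime import date, timedelta
from itertools import product

BLOCK_SIZE = 4  # rows per storage block; an arbitrary value for this toy model


def make_rows():
    """Build 64 rows: 4 days x 4 locations x 2 versions x 2 copies each."""
    days = [date(2024, 1, 1) + timedelta(days=i) for i in range(4)]
    locations = ["loc_a", "loc_b", "loc_c", "loc_d"]
    versions = ["v1", "v2"]
    rows = [
        {"create_date": d, "location_id": loc, "device_version": ver}
        for d, loc, ver in product(days, locations, versions)
        for _ in range(2)
    ]
    random.Random(0).shuffle(rows)  # mimic rows arriving in no particular order
    return rows


def rows_scanned(rows, filters, partition_col=None, cluster_cols=()):
    """Count rows read for equality filters under a given table layout."""
    # 1. Split rows into partitions and drop the ones the filter rules out.
    partitions = {}
    for row in rows:
        key = row[partition_col] if partition_col else None
        partitions.setdefault(key, []).append(row)
    if partition_col in filters:
        partitions = {
            k: v for k, v in partitions.items() if k == filters[partition_col]
        }

    scanned = 0
    for part in partitions.values():
        # 2. Clustering keeps each partition sorted by the cluster columns.
        if cluster_cols:
            part = sorted(part, key=lambda r: tuple(r[c] for c in cluster_cols))
        for start in range(0, len(part), BLOCK_SIZE):
            block = part[start:start + BLOCK_SIZE]
            # 3. Skip a block if a filtered cluster column's min/max range
            #    cannot contain the requested value.
            skip = any(
                not (min(r[c] for r in block) <= filters[c] <= max(r[c] for r in block))
                for c in cluster_cols
                if c in filters
            )
            if not skip:
                scanned += len(block)
    return scanned


rows = make_rows()
query = {"create_date": date(2024, 1, 2), "location_id": "loc_b"}

no_layout = rows_scanned(rows, query)
partition_only = rows_scanned(rows, query, partition_col="create_date")
partition_and_cluster = rows_scanned(
    rows, query,
    partition_col="create_date",
    cluster_cols=("location_id", "device_version"),
)
print(no_layout, partition_only, partition_and_cluster)  # 64 16 4
```

In this model the date partition removes three quarters of the rows before anything is read, and clustering then narrows the remaining partition to the single block that holds the requested location.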

Why other options are incorrect:

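Option A: BigQuery allows only one partitioning column per table, so a table cannot be partitioned by create_date, location_id and device_version together
Option C: Clustering alone can still skip blocks, but without partitions there is no partition pruning on create_date, so time-based queries lose the more predictable cost reduction that partitioning provides
Option D: Apart from the misspelling ("locationed" for location_id), it partitions by two non-date columns, which BigQuery does not support because only one partitioning column is allowed, and it gives up date-based partition pruning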

This combination of partitioning and clustering follows BigQuery best practices for optimizing both cost and performance.
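
As a minimal sketch of what Option B could look like in practice, the snippet below assembles a BigQuery `CREATE TABLE` statement with `PARTITION BY` and `CLUSTER BY` clauses. The table name `my_dataset.device_events` and the column types are assumptions made for illustration; the question only names the three columns.

```python
# Column types are assumed; the question does not state them.
COLUMNS = {
    "create_date": "DATE",
    "location_id": "STRING",
    "device_version": "STRING",
}


def build_ddl(table, partition_col, cluster_cols):
    """Return DDL for a table partitioned on one column and clustered on others."""
    if partition_col not in COLUMNS:
        raise ValueError(f"unknown partition column: {partition_col}")
    # BigQuery accepts at most four clustering columns.
    if not 1 <= len(cluster_cols) <= 4:
        raise ValueError("expected between 1 and 4 clustering columns")
    column_defs = ",\n  ".join(f"{name} {dtype}" for name, dtype in COLUMNS.items())
    return (
        f"CREATE TABLE `{table}` (\n  {column_defs}\n)\n"
        f"PARTITION BY {partition_col}\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )


# Hypothetical table name, used only for this example.
ddl = build_ddl(
    "my_dataset.device_events",
    partition_col="create_date",
    cluster_cols=["location_id", "device_version"],
)
print(ddl)
```

The order of the clustering columns matters: rows are sorted by the first column, then the second, so the column that queries filter on most often usually goes first.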
