Ultimate access to all questions.
You have a BigQuery table and you run a query with a WHERE clause that filters the data based on a timestamp and ID column. However, after using the bq query “-dry_run“ command, you discover that the query triggers a full scan of the table, even though the filter only selects a small fraction of the data. You want to minimize the amount of data scanned by BigQuery while keeping your SQL queries intact. Which approach should you take?
Explanation:
Correct
Option C is the right choice: Partitioned tables are special tables divided into segments, called partitions, that make it easier to manage and query your data. By dividing a large table into smaller partitions, you can improve query performance, and you can control costs by reducing the number of bytes read by a query. Clustered tables in BigQuery are tables that have a user-defined column sort order using clustered columns. Clustered tables can improve query performance and reduce query costs.