
Answer-first summary for fast verification
Answer: Redesign the schema to denormalize the data with nested and repeated data.
Feedback: A. Incorrect. The normalized form in transactional databases are efficient for writes, but not efficient for running queries against. B. Incorrect. Combining multiple tables into a single table using only outer joins could include significantly more redundancy than is required. Fewer tables are better on BigQuery, but the schema needs to be thought through. C. Incorrect. Normalizing the data is not the recommended approach for BigQuery. D. Correct. Denormalizing the data is the recommended approach. Joining large amounts of data repeatedly during data analysis increases costs. Links: https://cloud.google.com/bigquery/docs/best-practices-performance-overview More information: Courses: Modernizing Data Lakes and Data Warehouses on Google Cloud ● Building a Data Warehouse Skill Badge: Build a Data Warehouse with BigQuery Summary: Analytics oriented databases such as BigQuery have significantly more read requests than write requests. In this scenario, the analytics engine optimizes for query performance at the cost of data redundancy. BigQuery can also use other optimizations like partitioning and clustering to further improve the performance of queries. Diagnostic Question 04 Discussion A. Create an ingestion-time partitioned table with daily partitioning type. B. Create an ingestion-time partitioned table with yearly partitioning type. C. Create an integer-range partitioned table. D. Create a time-unit column-partitioned table with yearly partitioning type. You are ingesting data that is spread out over a wide range of dates into BigQuery at a fast rate. You need to partition the table to make queries performant. What should you do? 3 2. Feedback: A. Correct. A daily partition type is the most suitable given the volume of the data and the range of dates. B. Incorrect. A yearly partition type has too much data per partition, which makes queries inefficient. C. Incorrect. An integer-range partition type is not appropriate given that the data is defined by dates. D. Incorrect. A yearly partition type has too much data per partition, which makes queries inefficient. Links: https://cloud.google.com/bigquery/docs/partitioned-tables#date_timestamp_partitione d_tables https://cloud.google.com/bigquery/docs/creating-partitioned-tables#create_a_time-unit _column-partitioned_table More information: Courses: Modernizing Data Lakes and Data Warehouses on Google Cloud ● Building a Data Warehouse Building Resilient Streaming Analytics Systems on Google Cloud ● Advanced BigQuery Functionality and Performance
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
You have several large tables in your transaction databases. You need to move all the data to BigQuery for the business analysts to explore and analyze the data.
A
Retain the data on BigQuery with the same schema as the source.
B
Combine all the transactional database tables into a single table using outer joins.
C
Redesign the schema to normalize the data by removing all redundancies.
D
Redesign the schema to denormalize the data with nested and repeated data.