
Ultimate access to all questions.
To support the data science team's analysis, a data pipeline is required to transfer time-series transaction data into BigQuery. The dataset, initially 1.5 PB in size and growing by 3 TB daily, consists of thousands of transactions updated hourly with new statuses. This structured data will be utilized for building machine learning models. Which two strategies would best optimize performance and usability for the data science team? (Select two.)
A
Develop a data pipeline where status updates are appended to BigQuery instead of updated.
B
Copy a daily snapshot of transaction data to Cloud Storage and store it as an Avro file. Use BigQuery‘s support for external data sources to query.
C
Use BigQuery UPDATE to further reduce the size of the dataset.
D
Denormalize the data as much as possible.
E
Preserve the structure of the data as much as possible.