
Explanation:
Correct answers: C and E. Keeping the S3 bucket in the same AWS Region where the Athena queries run avoids cross-Region data transfer, which reduces both query latency and transfer costs. Converting the .csv data to Apache Parquet stores it in a columnar format that supports predicate pushdown: Athena reads only the columns and row groups a query needs, which sharply reduces the amount of data scanned and therefore both query time and cost.
An airline company is collecting metrics about flight activities for analytics. The company is conducting a proof of concept (POC) test to show how analytics can provide insights that the company can use to increase on-time departures. The POC test uses objects in Amazon S3 that contain the metrics in .csv format. The POC test uses Amazon Athena to query the data. The data is partitioned in the S3 bucket by date. As the amount of data increases, the company wants to optimize the storage solution to improve query performance. Which combination of solutions will meet these requirements? (Choose two.)
A. Add a randomized string to the beginning of the keys in Amazon S3 to get more throughput across partitions.
B. Use an S3 bucket that is in the same account that uses Athena to query the data.
C. Use an S3 bucket that is in the same AWS Region where the company runs Athena queries.
D. Preprocess the .csv data to JSON format by fetching only the document keys that the query requires.
E. Preprocess the .csv data to Apache Parquet format by fetching only the data blocks that are needed for predicates.
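To illustrate why option E helps, here is a minimal, self-contained sketch of the row-oriented vs. columnar trade-off. It uses only the Python standard library and hypothetical flight-metric field names (not from AWS or the question): a CSV-style row layout forces a scan of every byte even when a query needs one column, while a column-oriented layout (the idea behind Parquet) lets the query read only that column's bytes.

```python
import csv
import io

# Hypothetical flight-metrics rows (field names are illustrative only).
rows = [
    {"flight_id": "AA100", "dep_date": "2024-01-01", "delay_min": "5"},
    {"flight_id": "AA101", "dep_date": "2024-01-01", "delay_min": "0"},
    {"flight_id": "AA102", "dep_date": "2024-01-02", "delay_min": "30"},
]

# Row-oriented storage (CSV-like): a query that needs only delay_min
# still has to scan every byte of every row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["flight_id", "dep_date", "delay_min"])
writer.writeheader()
writer.writerows(rows)
row_bytes_scanned = len(buf.getvalue().encode())

# Column-oriented storage (Parquet-like idea): each column is stored
# contiguously, so the same query reads only the delay_min column.
columns = {name: [r[name] for r in rows] for name in rows[0]}
col_bytes_scanned = len(",".join(columns["delay_min"]).encode())

print("row layout bytes scanned:", row_bytes_scanned)
print("column layout bytes scanned:", col_bytes_scanned)
```

Real Parquet adds row-group statistics (min/max per column chunk), which is what lets Athena skip whole data blocks that cannot match a predicate; this sketch shows only the column-selection half of that saving.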