
Answer-first summary for fast verification
Answer: Use the query result reuse feature of Amazon Athena for the SQL queries.
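Amazon Athena query result reuse lets a subsequent identical query return the stored result of an earlier run instead of rescanning the underlying data, as long as that earlier result is no older than a configured maximum age. Athena does not check whether the source data has changed; the maximum age is what controls freshness. Because the dataset is updated only once a day while the BI application refreshes every hour, repeated queries within that age window can be served from previous results rather than scanning the petabyte-scale data each time, which reduces the data-scanned cost. The feature adds no new infrastructure (unlike ElastiCache) and has the least operational overhead. Of the other options, moving the data to S3 Glacier Deep Archive would make it unavailable to Athena queries, and converting the dataset to Apache Parquet would reduce data scanned but requires rewriting a petabyte-scale dataset.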
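As an illustration of how result reuse is requested per query, here is a minimal sketch using boto3's `start_query_execution` with its `ResultReuseConfiguration` parameter. The helper names, the database name, the S3 output location, and the chosen maximum age are hypothetical and not part of the question; the sketch also assumes a workgroup that supports result reuse (Athena engine version 3).

```python
from typing import Any, Dict

# Upper bound on result age, assumed here to be 7 days expressed in minutes.
MAX_REUSE_AGE_MINUTES = 7 * 24 * 60


def build_reuse_request(
    query: str,
    database: str,
    output_location: str,
    max_age_minutes: int = 60,
) -> Dict[str, Any]:
    """Build keyword arguments for start_query_execution with result reuse enabled.

    All argument values are supplied by the caller; nothing here is taken
    from the original question.
    """
    if not 0 <= max_age_minutes <= MAX_REUSE_AGE_MINUTES:
        raise ValueError("max_age_minutes must be between 0 and 10080")
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
        # Ask Athena to return a stored result if an identical query
        # finished within the last max_age_minutes.
        "ResultReuseConfiguration": {
            "ResultReuseByAgeConfiguration": {
                "Enabled": True,
                "MaxAgeInMinutes": max_age_minutes,
            }
        },
    }


def run_with_reuse(query: str, database: str, output_location: str,
                   max_age_minutes: int = 60) -> str:
    """Submit the query and return its execution ID (requires AWS credentials)."""
    import boto3  # imported lazily so the builder can be used without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        **build_reuse_request(query, database, output_location, max_age_minutes)
    )
    return response["QueryExecutionId"]
```

A call such as `run_with_reuse("SELECT ...", "bi_db", "s3://example-results/", max_age_minutes=120)` would submit the query with a two-hour reuse window. The right maximum age depends on when the daily Glue job finishes, since a reused result reflects the data as of the run that produced it.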
Author: Ritesh Yadav
Question 18
A financial company wants to use Amazon Athena to run on-demand SQL queries on a petabyte-scale dataset to support a business intelligence (BI) application. An AWS Glue job that runs during non-business hours updates the dataset once every day. The BI application has a standard data refresh frequency of 1 hour to comply with company policies. A data engineer wants to cost optimize the company's use of Amazon Athena without adding any additional infrastructure costs. Which solution will meet these requirements with the LEAST operational overhead?
A. Configure an Amazon S3 Lifecycle policy to move data to the S3 Glacier Deep Archive storage class after 1 day.
B. Use the query result reuse feature of Amazon Athena for the SQL queries.
C. Add an Amazon ElastiCache cluster between the BI application and Athena.
D. Change the format of the files that are in the dataset to Apache Parquet.