
Answer-first summary for fast verification
Answer: Modify the VPC flow log settings to store logs in Apache Parquet format and organize them into hourly partitions.
The correct answer is C. Apache Parquet is a columnar format, so Athena reads only the columns a query references and benefits from more efficient compression, which also reduces storage. Partitioning the data by hour lets Athena skip partitions outside the queried time range, further reducing the amount of data scanned. Option A only swaps one compression codec for another and leaves the text format unchanged. Option B affects upload speed and storage cost, not Athena query performance. Option D changes the Athena workgroup and engine version, but it does not address data format and partitioning, the two factors that most affect performance here.
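As a rough illustration of option C, the sketch below builds the parameters for the EC2 CreateFlowLogs call with Parquet output and hourly partitions enabled through DestinationOptions. The VPC ID and bucket ARN are hypothetical placeholders, the helper name build_flow_log_request is made up for this example, and enabling HiveCompatiblePartitions is an optional choice rather than something the question requires. It also assumes that a new flow log subscription is created with these settings, since existing flow logs generally cannot be edited in place.

```python
from typing import Any


def build_flow_log_request(vpc_id: str, bucket_arn: str) -> dict[str, Any]:
    """Build keyword arguments for an EC2 CreateFlowLogs request.

    The result asks for Parquet files in hourly partitions, delivered to S3.
    """
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
        "DestinationOptions": {
            "FileFormat": "parquet",           # columnar output instead of plain text
            "PerHourPartition": True,          # one partition per hour
            "HiveCompatiblePartitions": True,  # optional: key=value style S3 prefixes
        },
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    params = build_flow_log_request(
        "vpc-0123456789abcdef0",
        "arn:aws:s3:::example-central-flow-logs",
    )
    print(params)
    # To apply the settings (requires boto3 and AWS credentials):
    # import boto3
    # boto3.client("ec2").create_flow_logs(**params)
```

With the data laid out this way, an Athena query that filters on the date and hour partition columns only has to scan the matching partitions.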
Author: LeetQuiz Editorial Team
A company operates multiple AWS accounts, each with a VPC configured to publish VPC flow logs in text format to a centralized Amazon S3 bucket. These logs are compressed using gzip and must be retained indefinitely. A security engineer periodically uses Amazon Athena to analyze these logs, but performance has declined due to the increasing volume of logs. A solutions architect is tasked with enhancing the log analysis performance and minimizing storage usage. Which solution offers the most significant performance improvement?
A
Develop an AWS Lambda function to decompress gzip files and recompress them with bzip2, then set up an S3 event notification to trigger this Lambda function upon file creation.
B
Activate S3 Transfer Acceleration for the S3 bucket and implement an S3 Lifecycle policy to transition files to the S3 Intelligent-Tiering storage class immediately after upload.
C
Modify the VPC flow log settings to store logs in Apache Parquet format and organize them into hourly partitions.
D
Establish a new Athena workgroup without data usage limits and utilize Athena engine version 2 for querying.