
Answer-first summary for fast verification
Answer: Use Amazon Athena directly with Amazon S3 to run the queries as needed.
## Explanation

**Correct Answer: C - Use Amazon Athena directly with Amazon S3 to run the queries as needed.**

### Why Amazon Athena is the best choice:

1. **Minimal operational overhead**: Amazon Athena is a serverless query service, meaning there's no infrastructure to provision or manage. You only pay for the queries you run.
2. **Direct S3 integration**: Since the logs are already stored in JSON format in S3, Athena can query them directly without needing to move or transform the data.
3. **On-demand querying**: Athena is perfect for ad-hoc, on-demand queries against data in S3.
4. **Simple queries**: Athena supports standard SQL and is well-suited for simple queries against structured/semi-structured data like JSON.
5. **Minimal architectural changes**: The logs remain in S3, and Athena queries them directly, requiring minimal changes to the existing architecture.

### Why the other options are not optimal:

**A. Amazon Redshift**:

- Requires loading data from S3 into Redshift (data movement)
- Involves provisioning and managing a Redshift cluster
- Higher operational overhead and cost for simple, on-demand queries

**B. Amazon CloudWatch Logs**:

- Would require moving logs from S3 to CloudWatch Logs
- CloudWatch Logs Insights has query capabilities but is optimized for log analytics, not general JSON analysis
- Additional data transfer and storage costs

**D. AWS Glue + Amazon EMR**:

- Overly complex for simple, on-demand queries
- Requires managing EMR clusters (even transient ones)
- Higher operational overhead with Glue catalog and EMR cluster management
- Better suited for complex ETL jobs, not simple on-demand queries

### Key AWS Services Comparison:

- **Amazon Athena**: Serverless, SQL queries directly on S3 data, pay-per-query
- **Amazon Redshift**: Data warehouse, requires data loading, cluster management
- **AWS Glue**: ETL service, data cataloging
- **Amazon EMR**: Big data processing framework, requires cluster management

Given the requirements (logs already in S3, simple on-demand queries, minimal operational overhead), Amazon Athena is clearly the most appropriate solution.
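
To make the Athena option more concrete, here is a minimal sketch of what querying JSON logs in place could look like. The bucket path, database, table, and column names (`event_time`, `log_level`, `message`) are hypothetical, since the question does not describe the log schema. The sketch assumes the OpenX JSON SerDe and, for submission, the `boto3` Athena client.

```python
# Hypothetical names: the bucket, database, table and columns below are
# illustrative assumptions, not details from the question.
LOG_LOCATION = "s3://example-app-logs/json/"
DATABASE = "app_logs_db"
TABLE = "app_logs"


def create_table_ddl(database: str, table: str, location: str) -> str:
    """Build Athena DDL for an external table over JSON files already in S3."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        "  event_time string,\n"
        "  log_level string,\n"
        "  message string\n"
        ")\n"
        "ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'\n"
        f"LOCATION '{location}'"
    )


def error_count_query(database: str, table: str) -> str:
    """Build a simple on-demand query: count the ERROR entries."""
    return (
        "SELECT log_level, COUNT(*) AS events "
        f"FROM {database}.{table} "
        "WHERE log_level = 'ERROR' "
        "GROUP BY log_level"
    )


def run_query(sql: str, output_location: str) -> str:
    """Submit a statement to Athena and return its query execution ID.

    Needs AWS credentials and an S3 location for query results.
    """
    import boto3  # imported lazily so the builders above work without the SDK

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Print the statements only; nothing is sent to AWS here.
    print(create_table_ddl(DATABASE, TABLE, LOG_LOCATION))
    print(error_count_query(DATABASE, TABLE))
```

The table definition only points at the existing S3 prefix, so the logs stay where they are and no cluster is provisioned. That is the property the explanation relies on when comparing Athena with the Redshift and EMR options.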
Author: LeetQuiz Editorial Team
A company needs the ability to analyze the log files of its proprietary application. The logs are stored in JSON format in an Amazon S3 bucket. Queries will be simple and will run on-demand. A solutions architect needs to perform the analysis with minimal changes to the existing architecture.
What should the solutions architect do to meet these requirements with the LEAST amount of operational overhead?
A. Use Amazon Redshift to load all the content into one place and run the SQL queries as needed.
B. Use Amazon CloudWatch Logs to store the logs. Run SQL queries as needed from the Amazon CloudWatch console.
C. Use Amazon Athena directly with Amazon S3 to run the queries as needed.
D. Use AWS Glue to catalog the logs. Use a transient Apache Spark cluster on Amazon EMR to run the SQL queries as needed.