
A company collects a large volume of data from a fleet of IoT devices. The data is stored in Optimized Row Columnar (ORC) format in the Hadoop Distributed File System (HDFS) on a persistent Amazon EMR cluster. The company's data analytics team queries the data with SQL through Apache Presto, which is deployed on the same EMR cluster. The queries scan large datasets, each runs in less than 15 minutes, and they are executed only between 5 PM and 10 PM. The company is concerned about the high cost of the current setup, and a solutions architect must identify the most cost-effective solution that still supports SQL queries on the data. Which of the following solutions meets these requirements?
A
Store data in Amazon S3 and use Amazon Redshift Spectrum for querying.
B
Store data in Amazon S3 and utilize the AWS Glue Data Catalog with Amazon Athena for querying.
C
Store data in the EMR File System (EMRFS) and employ Presto within Amazon EMR for querying.
D
Store data in Amazon Redshift and use Amazon Redshift for querying.
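
When weighing the options, it may help to put the scenario's query window in numbers. The short Python sketch below only does the arithmetic for the 5 PM to 10 PM window stated in the question; the constant and function names are illustrative, and no pricing figures are assumed.

```python
# Query window taken from the scenario: 5 PM to 10 PM each day.
WINDOW_START_HOUR = 17
WINDOW_END_HOUR = 22
HOURS_PER_DAY = 24


def active_fraction(start_hour: int, end_hour: int) -> float:
    """Return the fraction of a day during which queries actually run."""
    return (end_hour - start_hour) / HOURS_PER_DAY


active = active_fraction(WINDOW_START_HOUR, WINDOW_END_HOUR)
idle = 1 - active

# A persistent cluster is billed around the clock, including the idle share.
print(f"active: {active:.1%}, idle: {idle:.1%}")  # active: 20.8%, idle: 79.2%
```

Under these figures, always-on compute sits unused for roughly four fifths of each day, which is the cost pressure the question asks the architect to address.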