A company is collecting a large amount of data from a fleet of IoT devices. Data is stored as Optimized Row Columnar (ORC) files in the Hadoop Distributed File System (HDFS) on a persistent Amazon EMR cluster. The company's data analytics team queries the data by using SQL in Apache Presto deployed on the same EMR cluster. Queries scan large amounts of data, always run for less than 15 minutes, and run only between 5 PM and 10 PM. The company is concerned about the high cost associated with the current solution. A solutions architect must propose the most cost-effective solution that will allow SQL data queries. Which solution will meet these requirements?

B. Store data in Amazon S3. Use the AWS Glue Data Catalog and Amazon Athena to query data.
C. Store data in EMR File System (EMRFS). Use Presto in Amazon EMR to query data.
D. Store data in Amazon Redshift. Use Amazon Redshift to query data.

Answer: B

Analysis: Redshift Spectrum requires a running Amazon Redshift cluster, which is expensive. The cost per query is the same for Redshift Spectrum and Athena, although Athena is not intended for complex or highly parallel queries over large data sets. Athena is serverless and billed per query, so nothing is running or billed outside the 5 PM to 10 PM query window.

Q184. A company needs to store and process image data that will be uploaded from mobile devices using a custom mobile app. Usage peaks between 8 AM and 5 PM on weekdays, with thousands of uploads per minute. The app is rarely used at any other time. A user is notified when image processing is complete. Which combination of actions should a solutions architect take to ensure image processing can scale to handle the load?