
Answer-first summary for fast verification
Answer: Use Amazon Kinesis Data Streams to collect the inbound sensor data, analyze the data with Kinesis clients, and save the results to an Amazon Redshift cluster using Amazon EMR.
Amazon Kinesis Data Streams fits this use case because it ingests large volumes of streaming data and makes it available to consumers in near real time. Here’s why option B is the correct choice:

1. **Near-Real-Time Analytics**: Kinesis Data Streams can easily absorb the 8 KB of genomic data that each device pushes every second, and consumers can read each record shortly after it is written. That low latency is what lets the platform return timely analytics to researchers.
2. **Data Flexibility and Parallelism**: Analyzing the data with Kinesis clients (consumer applications, for example ones built with the Kinesis Client Library) keeps the solution flexible and parallel. A stream is split into shards that are processed in parallel, and multiple consumer applications can read the same stream independently.
3. **Durability and Scalability**: Kinesis Data Streams stores records durably for a retention period (24 hours by default, extendable), so data is preserved until it has been processed and delivered to its final destination. Capacity scales by adding shards; in on-demand capacity mode the service adjusts capacity automatically.
4. **Data Warehousing Needs**: The results must land in a data warehouse, and Amazon Redshift is the AWS data warehouse service built for complex analytical queries. Amazon EMR (Elastic MapReduce) transforms and prepares the data with frameworks such as Apache Hadoop and Apache Spark before it is loaded into Redshift.

Together, Kinesis Data Streams for ingestion, Kinesis clients for analysis, Amazon EMR for processing, and Amazon Redshift for storage and querying cover the path from ingestion to the data warehouse and meet all three requirements.
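To make the scalability point concrete, here is a rough sizing sketch for provisioned mode. It assumes the documented per-shard write limits of 1 MB per second and 1,000 records per second, treats 1 MB as 10^6 bytes and 8 KB as 8,192 bytes (both are assumptions chosen to be conservative), and uses a hypothetical device count, since the question does not say how many devices will be deployed.

```python
import math

# Per-shard write limits for Kinesis Data Streams (provisioned mode).
# 1 MB is taken as 10**6 bytes here, which is a conservative assumption.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000
SHARD_WRITE_RECORDS_PER_SEC = 1_000


def shards_needed(devices, record_bytes=8 * 1024, records_per_device_per_sec=1):
    """Estimate the minimum shard count needed to absorb writes from `devices`."""
    bytes_per_sec = devices * record_bytes * records_per_device_per_sec
    records_per_sec = devices * records_per_device_per_sec
    by_bytes = math.ceil(bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    # Hypothetical fleet sizes, for illustration only.
    for fleet in (1, 1_000, 10_000):
        print(fleet, "devices ->", shards_needed(fleet), "shard(s)")
```

With 8 KB records, the byte limit is reached well before the record-count limit, so throughput rather than record rate drives the shard count in this sketch.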
Author: LeetQuiz Editorial Team
A company is developing a gene reporting device that will collect genomic information to assist researchers with collecting large samples of data from a diverse population. The device will push 8 KB of genomic data every second to a data platform that will need to process and analyze the data and provide information back to researchers. The data platform must meet the following requirements:

- Provide near-real-time analytics of the inbound genomic data
- Ensure the data is flexible, parallel, and durable
- Deliver results of processing to a data warehouse

Which strategy should a solutions architect use to meet these requirements?
A
Use Amazon Kinesis Data Firehose to collect the inbound sensor data, analyze the data with Kinesis clients, and save the results to an Amazon RDS instance.
B
Use Amazon Kinesis Data Streams to collect the inbound sensor data, analyze the data with Kinesis clients, and save the results to an Amazon Redshift cluster using Amazon EMR.
C
Use Amazon S3 to collect the inbound device data, analyze the data from Amazon SQS with Kinesis, and save the results to an Amazon Redshift cluster.
D
Use an Amazon API Gateway to put requests into an Amazon SQS queue, analyze the data with an AWS Lambda function, and save the results to an Amazon Redshift cluster using Amazon EMR.
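As an illustration of the ingestion step in option B, the sketch below shows how a device-side producer might write one reading to a stream. It is a minimal example, not a reference implementation: the function name `send_reading`, the stream name, the device ID, and the JSON payload shape are all hypothetical. The only AWS call it relies on is the Kinesis `put_record` operation with its `StreamName`, `Data`, and `PartitionKey` parameters; the client is passed in so the function can be exercised without AWS credentials.

```python
import json


def send_reading(kinesis_client, stream_name, device_id, payload):
    """Write one reading to the stream, using the device ID as the partition key.

    Records with the same partition key go to the same shard, so readings
    from one device stay in order.
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload).encode("utf-8"),
        PartitionKey=device_id,
    )


# Example usage (assumes boto3 is installed and AWS credentials are configured;
# the stream name and payload below are hypothetical):
#
#   import boto3
#   client = boto3.client("kinesis")
#   send_reading(client, "genomics-stream", "device-0001", {"sample": "ACGT"})
```

Using the device ID as the partition key spreads devices across shards while keeping each device's records ordered, which is one way to get the parallelism the requirements ask for.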