
Answer-first summary for fast verification
Answer: Use Amazon S3 as an intermediary storage, dump data from Firehose to S3, trigger Lambda functions from S3 events, and then load data from S3 to Redshift.
Option A is the most efficient design for high data volumes. S3 acts as a durable buffer that decouples ingestion from transformation: Firehose delivers batches to S3, S3 object-created events trigger Lambda for controlled, scalable processing, and the transformed data is then loaded into Redshift.
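The S3-triggered step can be sketched as a Lambda handler that reads the bucket and key from the S3 event and builds the Redshift COPY statement that would load the object. This is a minimal sketch: the table name `customer_events`, the IAM role ARN, and the JSON format are placeholder assumptions, and actually executing the statement (e.g. via the Redshift Data API) is left as a comment.

```python
def build_copy_statement(event, table="customer_events",
                         iam_role="arn:aws:iam::123456789012:role/RedshiftCopyRole"):
    """Extract the S3 object written by Firehose from the event notification
    and build a Redshift COPY statement that loads it.

    Table name, IAM role ARN, and file format are illustrative assumptions.
    """
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS JSON 'auto';"
    )


def lambda_handler(event, context):
    # Triggered by an S3 ObjectCreated event on the Firehose landing bucket.
    sql = build_copy_statement(event)
    # In the real pipeline you would run this against Redshift, e.g. with
    # boto3.client("redshift-data").execute_statement(...); omitted here
    # so the sketch stays self-contained.
    return {"statusCode": 200, "sql": sql}
```

Because Redshift's COPY loads directly from S3 in parallel, batching files in S3 and issuing one COPY per object (or per manifest) scales far better than row-by-row inserts from Lambda.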
Author: LeetQuiz Editorial Team
You are tasked with creating a data pipeline for a retail company that needs to analyze customer behavior. The pipeline must extract data from Amazon Kinesis Data Firehose, transform it using AWS Lambda, and load it into Amazon Redshift. How would you design this pipeline to handle high volumes of data efficiently?
A
Use Amazon S3 as an intermediary storage, dump data from Firehose to S3, trigger Lambda functions from S3 events, and then load data from S3 to Redshift.
B
Directly stream data from Firehose to Lambda and then to Redshift without any intermediary storage.
C
Manually run AWS Glue jobs to extract data from Firehose, transform it, and then load it into Redshift.
D
Use Amazon SQS to queue Firehose events and have Lambda functions poll the queue for processing.