
Answer-first summary for fast verification
Answer: Use the AWS Step Functions Map state in Distributed mode to process the data in parallel.
## Explanation

**Correct Answer: B - Use the AWS Step Functions Map state in Distributed mode to process the data in parallel.**

### Why Option B is Correct:

1. **AWS Step Functions Map State in Distributed Mode** is specifically designed for large-scale parallel processing of data stored in Amazon S3.
2. **Distributed Mode** can process thousands of items in parallel by running iterations as separate child workflow executions, making it ideal for large datasets.
3. **Serverless Architecture**: Both Step Functions and Lambda are serverless, meeting the requirement for a serverless solution.
4. **Operational Efficiency**: Distributed mode orchestrates the parallel processing automatically, without manual management of multiple Lambda functions.
5. **S3 Integration**: A Distributed Map state can read its items directly from S3, for example the objects under a prefix or the rows of a CSV or JSON file.

### Why Other Options Are Less Optimal:

**A. AWS Step Functions Map state in Inline mode**:
- Inline mode is limited to 40 concurrent iterations and is bound by the 256 KiB payload size limit.
- Not suitable for processing thousands of items in parallel at scale.

**C. AWS Glue**:
- While AWS Glue can process data in parallel, it is primarily designed for ETL (Extract, Transform, Load) jobs and data cataloging.
- Less suitable for on-demand processing, as jobs typically have longer startup times and are a better fit for scheduled batch processing.
- Adds more operational overhead than a Step Functions workflow for this use case.

**D. Use several AWS Lambda functions**:
- While technically possible, this would require manual orchestration and coordination of multiple Lambda functions.
- Less operationally efficient than the Step Functions Map state, which handles parallel execution and error handling automatically.
- Would require additional code for coordination, error handling, and result aggregation.

### Key AWS Concepts:

- **AWS Step Functions Map State**: A workflow state that runs the same set of steps for each item in a dataset, in parallel.
- **Distributed Mode**: Runs iterations as separate child workflow executions, allowing massive parallel processing (up to 10,000 parallel child executions).
- **Inline Mode**: Processes items within a single execution, with limits on concurrency and payload size.
- **Serverless Processing**: Both Lambda and Step Functions provide serverless compute and orchestration without managing infrastructure.

For large-scale parallel on-demand processing of semistructured data stored in S3, the AWS Step Functions Map state in Distributed mode provides the most operationally efficient solution, with automatic scaling, error handling, and minimal management overhead.
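
To make the Distributed mode option more concrete, here is a minimal sketch that assembles an Amazon States Language definition for a Distributed Map state as a Python dictionary. The bucket name, prefix, Lambda function name, state names, and the helper functions themselves are hypothetical and chosen only for illustration; the 1 to 10,000 concurrency check mirrors the ceiling quoted above and is an assumption of this sketch, not a rule taken from the question.

```python
import json

# Concurrency ceiling for Distributed mode as quoted in the explanation.
DISTRIBUTED_MAX_CONCURRENCY = 10_000


def build_map_state(bucket, prefix, function_name, max_concurrency=1000):
    """Return a Map state (as a dict) configured for Distributed mode.

    All argument values are placeholders supplied by the caller.
    """
    if not 1 <= max_concurrency <= DISTRIBUTED_MAX_CONCURRENCY:
        raise ValueError(
            f"max_concurrency must be between 1 and {DISTRIBUTED_MAX_CONCURRENCY}"
        )
    return {
        "Type": "Map",
        # Read the items to process by listing objects under an S3 prefix.
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "Parameters": {"Bucket": bucket, "Prefix": prefix},
        },
        # Each item is handled by a child workflow that invokes one Lambda.
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
            "StartAt": "ProcessItem",
            "States": {
                "ProcessItem": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    "Parameters": {
                        "FunctionName": function_name,
                        "Payload.$": "$",
                    },
                    "End": True,
                }
            },
        },
        "MaxConcurrency": max_concurrency,
        "End": True,
    }


def build_state_machine(bucket, prefix, function_name, max_concurrency=1000):
    """Wrap the Map state in a one-state workflow definition."""
    return {
        "Comment": "Sketch: process S3 objects in parallel with a Distributed Map",
        "StartAt": "ProcessObjects",
        "States": {
            "ProcessObjects": build_map_state(
                bucket, prefix, function_name, max_concurrency
            )
        },
    }


# Hypothetical names, for illustration only.
definition = build_state_machine("example-bucket", "incoming/", "process-item")

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

The printed JSON could be supplied as the definition when creating a state machine. Switching `"Mode"` to `"INLINE"` would select the Inline variant from option A, which reads its items from the state input rather than through an `ItemReader` and is subject to the lower concurrency limit noted above.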
Author: LeetQuiz Editorial Team
A company recently migrated to the AWS Cloud. The company wants a serverless solution for large-scale parallel on-demand processing of a semistructured dataset. The data consists of logs, media files, sales transactions, and IoT sensor data that is stored in Amazon S3. The company wants the solution to process thousands of items in the dataset in parallel.
Which solution will meet these requirements with the MOST operational efficiency?
A. Use the AWS Step Functions Map state in Inline mode to process the data in parallel.
B. Use the AWS Step Functions Map state in Distributed mode to process the data in parallel.
C. Use AWS Glue to process the data in parallel.
D. Use several AWS Lambda functions to process the data in parallel.