
Answer-first summary for fast verification
Answer: Asynchronous inference
## Explanation

**Correct Answer: C. Asynchronous inference**

### Why Asynchronous Inference Is the Right Choice

1. **Large input data (up to 1 GB)**: Asynchronous inference accepts payloads of up to 1 GB, far beyond the 6 MB limit of real-time inference.
2. **Long processing times (up to 1 hour)**: Real-time inference requests time out after 60 seconds, while asynchronous inference supports processing times of up to one hour per request.
3. **Near real-time latency requirement**: Asynchronous inference queues incoming requests and processes them as soon as capacity is available, writing results to Amazon S3 and sending success or failure notifications through Amazon SNS. For payloads and processing times that rule out real-time inference, this is the closest available option to real-time behavior.

### Why the Other Options Are Incorrect

- **A. Real-time inference**: Limited to payloads of about 6 MB and a 60-second request timeout; it cannot handle 1 GB files or 1-hour processing times.
- **B. Serverless inference**: Subject to similar payload-size and timeout constraints as real-time inference, so it also cannot meet the 1 GB / 1-hour requirements.
- **D. Batch transform**: Designed for offline processing of large datasets, not for near real-time requirements; it processes data in batches with no latency guarantees.

### Key Characteristics of Asynchronous Inference

- **Payload size**: Up to 1 GB
- **Processing time**: Up to 1 hour per request
- **Result delivery**: Results written to Amazon S3, with Amazon SNS notifications on success or failure
- **Use cases**: Large document processing, video analysis, genome sequencing, and other compute-intensive ML tasks

This solution lets the company process large ML workloads while keeping near real-time responsiveness through asynchronous processing patterns.
Author: Ritesh Yadav
A company uses Amazon SageMaker for its ML pipeline in a production environment. The company has large input data sizes up to 1 GB and processing times up to 1 hour. The company needs near real-time latency. Which SageMaker inference option meets these requirements?
A
Real-time inference
B
Serverless inference
C
Asynchronous inference
D
Batch transform