
Answer-first summary for fast verification
Answer: Asynchronous inference
## Analysis of SageMaker Inference Options

The question specifies three requirements:

- **Large input data sizes up to 1 GB**
- **Processing times up to 1 hour**
- **Near real-time latency requirements**

Let's evaluate each SageMaker inference option against them.

### **A: Real-time Inference**

- **Purpose**: Designed for low-latency, synchronous inference with immediate responses
- **Payload size**: Typically limited to 6 MB per request
- **Processing time**: Optimized for milliseconds to seconds, with invocations timing out after about 60 seconds
- **Suitability**: Not suitable for this scenario because:
  - Cannot handle 1 GB payloads
  - Not designed for hour-long processing times
  - Large, long-running requests would time out or fail

### **B: Serverless Inference**

- **Purpose**: Provides automatic scaling without infrastructure management
- **Payload size**: Limited to 4 MB per request
- **Processing time**: Designed for short-duration inference (up to about 60 seconds)
- **Suitability**: Not appropriate because:
  - Cannot accommodate 1 GB input sizes
  - Not built for hour-long processing
  - Better suited for intermittent traffic with smaller payloads

### **C: Asynchronous Inference**

- **Purpose**: Specifically designed for large payloads and longer processing times
- **Payload size**: Supports up to 1 GB per request
- **Processing time**: Handles processing times up to 1 hour
- **Latency**: Provides near real-time responses (not immediate, but returned as soon as each request finishes processing)
- **Suitability**: **OPTIMAL CHOICE** because:
  - Directly matches the 1 GB payload requirement
  - Specifically supports up to 1 hour of processing time
  - Designed for near real-time latency scenarios
  - Queues requests and processes them asynchronously, preventing timeouts

### **D: Batch Transform**

- **Purpose**: Designed for offline, large-scale batch processing
- **Payload size**: Can handle large datasets
- **Processing time**: Suitable for long-running jobs
- **Latency**: Not real-time or near real-time; results are available only after the job completes
- **Suitability**: Not appropriate because:
  - Does not provide near real-time latency
  - Designed for offline processing rather than production inference
  - Results are not available incrementally

## Conclusion

**Asynchronous Inference (Option C)** is the correct choice because it is the only SageMaker inference option specifically engineered to handle all three requirements:

1. **Large payloads up to 1 GB** (matching the requirement exactly)
2. **Long processing times up to 1 hour** (directly supported)
3. **Near real-time latency** (requests are queued and results are delivered when ready, which suits production environments where an immediate response isn't critical but timely results are needed)

Each of the other options fails at least one requirement: real-time and serverless inference cannot handle the payload size or the processing time, and batch transform does not provide near real-time latency.
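To make the asynchronous option more concrete, here is a minimal sketch of how a request to an asynchronous endpoint might be prepared. Unlike real-time inference, the payload is not sent inline: the request points at an object already uploaded to Amazon S3. The endpoint name, bucket, and key are hypothetical placeholders, and the helper `build_async_request` is an illustrative name, not part of any AWS SDK. The actual `boto3` call is left as a comment because it needs AWS credentials and a deployed endpoint.

```python
from urllib.parse import urlparse

# Upper bound on processing time per request (1 hour), as cited in the analysis.
MAX_INVOCATION_TIMEOUT_SECONDS = 3600


def build_async_request(endpoint_name, input_s3_uri,
                        timeout_seconds=MAX_INVOCATION_TIMEOUT_SECONDS,
                        content_type="application/json"):
    """Build keyword arguments for an asynchronous invocation (hypothetical helper)."""
    parsed = urlparse(input_s3_uri)
    # The input must reference an S3 object: s3://<bucket>/<key>
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"expected an s3://bucket/key URI, got {input_s3_uri!r}")
    if not 1 <= timeout_seconds <= MAX_INVOCATION_TIMEOUT_SECONDS:
        raise ValueError("timeout_seconds must be between 1 and 3600")
    return {
        "EndpointName": endpoint_name,
        "InputLocation": input_s3_uri,
        "ContentType": content_type,
        "InvocationTimeoutSeconds": timeout_seconds,
    }


# Placeholder names; replace with a real endpoint and S3 object.
request = build_async_request(
    "my-async-endpoint",
    "s3://my-bucket/inputs/payload.json",
)

# With credentials and a deployed asynchronous endpoint, this could be sent as:
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint_async(**request)
#   print(response["OutputLocation"])  # S3 location where the result will appear
```

The call returns as soon as the request is queued. The result is written to the S3 output location when processing finishes, which is why hour-long jobs do not hit a synchronous timeout.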
Author: LeetQuiz Editorial Team
A company uses Amazon SageMaker for its machine learning pipeline in a production environment. The company has large input data sizes up to 1 GB and processing times up to 1 hour. The company requires near real-time latency.
Which SageMaker inference option meets these requirements?
- A. Real-time inference
- B. Serverless inference
- C. Asynchronous inference
- D. Batch transform
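
The elimination process for this question can be sketched as a small filter over the four options. The limits below are assumptions for illustration and should be verified against current AWS documentation. Batch transform is modelled as unbounded in size and duration, which is a simplification. All class and function names are hypothetical.

```python
from dataclasses import dataclass

MB = 1024 ** 2
GB = 1024 ** 3


@dataclass(frozen=True)
class InferenceOption:
    label: str
    name: str
    max_payload_bytes: float       # assumed per-request payload limit
    max_processing_seconds: float  # assumed per-request processing limit
    near_real_time: bool           # whether results arrive per request, soon after processing


# Assumed limits for illustration only.
OPTIONS = [
    InferenceOption("A", "Real-time inference", 6 * MB, 60, True),
    InferenceOption("B", "Serverless inference", 4 * MB, 60, True),
    InferenceOption("C", "Asynchronous inference", 1 * GB, 3600, True),
    InferenceOption("D", "Batch transform", float("inf"), float("inf"), False),
]


def matching_options(payload_bytes, processing_seconds, needs_near_real_time):
    """Return the options that satisfy every stated requirement."""
    return [
        option for option in OPTIONS
        if payload_bytes <= option.max_payload_bytes
        and processing_seconds <= option.max_processing_seconds
        and (option.near_real_time or not needs_near_real_time)
    ]


# Requirements from the question: 1 GB input, 1 hour processing, near real-time.
result = matching_options(1 * GB, 3600, needs_near_real_time=True)
print([f"{option.label}: {option.name}" for option in result])
# → ['C: Asynchronous inference']
```

Dropping the near real-time requirement would also admit batch transform, which shows that latency is the requirement separating options C and D.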