AWS Certified AI Practitioner

Explanation:

Analysis of SageMaker Inference Options

Based on the requirements specified in the question:

- Large input data sizes up to 1 GB
- Processing times up to 1 hour
- Near real-time latency requirements

Let's evaluate each SageMaker inference option:

A: Real-time Inference

Purpose: Designed for low-latency, synchronous inference with immediate responses
Payload size: Limited to megabytes per request (6 MB historically; more recent AWS documentation lists 25 MB)
Processing time: Optimized for milliseconds to seconds, with synchronous invocations timing out after about 60 seconds
Suitability: Not suitable for this scenario because:
- Cannot handle 1 GB payloads
- Not designed for hour-long processing times
- Would likely time out or fail with such large, long-running requests

B: Serverless Inference

Purpose: Provides automatic scaling without infrastructure management
Payload size: Limited to 4 MB per request
Processing time: Designed for short-duration inference (up to 60 seconds)
Suitability: Not appropriate because:
- Cannot accommodate 1 GB input sizes
- Not built for hour-long processing
- Better suited for intermittent traffic with smaller payloads

C: Asynchronous Inference

Purpose: Specifically designed for large payloads and longer processing times
Payload size: Supports up to 1 GB per request
Processing time: Handles processing times up to 1 hour
Latency: Near real-time; requests are queued and each result is written to Amazon S3 as soon as processing completes, rather than being returned synchronously
Suitability: OPTIMAL CHOICE because:
- Directly matches the 1 GB payload requirement
- Specifically supports up to 1 hour processing time
- Designed for near real-time latency scenarios
- Queues requests and processes them asynchronously, preventing timeouts
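
As a rough illustration of the queued pattern described above, the following Python sketch submits one request through the `invoke_endpoint_async` operation of the SageMaker runtime API. The helper name `submit_async_request`, the endpoint name, and the S3 URIs are hypothetical, and the client is passed in as an argument so that the snippet makes no assumption about how AWS credentials are configured.

```python
def submit_async_request(runtime_client, endpoint_name, input_s3_uri,
                         content_type="application/json"):
    """Queue one asynchronous inference request and return its output location.

    runtime_client is expected to behave like boto3.client("sagemaker-runtime").
    The input payload is not sent in the request body; it must already be
    uploaded to S3, which is how payloads of up to 1 GB are accommodated.
    """
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,       # S3 URI of the uploaded payload
        ContentType=content_type,
        InvocationTimeoutSeconds=3600,    # allow up to 1 hour of processing
    )
    # The call returns once the request is queued. The result appears at this
    # S3 location when processing completes.
    return response["OutputLocation"]


# Hypothetical usage (requires boto3, AWS credentials, and a deployed endpoint):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   output_uri = submit_async_request(
#       client, "my-async-endpoint", "s3://example-bucket/input/payload.json")
```

The caller then polls the returned S3 location or relies on a notification to learn that the result is ready, which is why the response is near real-time rather than immediate.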

D: Batch Transform

Purpose: Designed for offline, large-scale batch processing
Payload size: Can handle large datasets
Processing time: Suitable for long-running jobs
Latency: Not real-time or near real-time; results are available after complete processing
Suitability: Not appropriate because:
- Does not provide near real-time latency
- Runs as an offline job over an entire dataset rather than serving individual requests from a persistent endpoint
- Results are not available incrementally

Conclusion

Asynchronous Inference (Option C) is the correct choice because it is the only SageMaker inference option specifically engineered to handle:

- Large payloads up to 1 GB (matching the requirement exactly)
- Long processing times up to 1 hour (directly supported)
- Near real-time latency (queues requests and provides results when ready, suitable for production environments where immediate response isn't critical but timely results are needed)

The other options either cannot handle the payload size, are not designed for such long processing times, or do not provide the required latency characteristics.
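
The elimination reasoning above can be sketched as a small Python helper. This is only an illustration: the function, the `Workload` type, and the thresholds are assumptions made for the example, not an AWS API. The 1 GB and 1 hour limits come from this explanation, while the single 6 MB and 60 second threshold for the two synchronous options is a simplification, so current AWS quotas should be checked before relying on any of these figures.

```python
from dataclasses import dataclass

MB = 1024 ** 2
GB = 1024 ** 3

# Illustrative thresholds only (assumptions for this sketch).
SYNC_MAX_PAYLOAD = 6 * MB     # simplified limit for real-time and serverless
SYNC_MAX_SECONDS = 60
ASYNC_MAX_PAYLOAD = 1 * GB    # asynchronous inference, as stated above
ASYNC_MAX_SECONDS = 3600


@dataclass(frozen=True)
class Workload:
    payload_bytes: int
    processing_seconds: float
    needs_near_real_time: bool
    intermittent_traffic: bool = False


def choose_inference_option(w: Workload) -> str:
    """Return the inference option suggested by the elimination reasoning."""
    fits_sync = (w.payload_bytes <= SYNC_MAX_PAYLOAD
                 and w.processing_seconds <= SYNC_MAX_SECONDS)
    if fits_sync:
        # Small, fast requests: serverless suits intermittent traffic.
        return "Serverless inference" if w.intermittent_traffic else "Real-time inference"

    fits_async = (w.payload_bytes <= ASYNC_MAX_PAYLOAD
                  and w.processing_seconds <= ASYNC_MAX_SECONDS)
    if fits_async and w.needs_near_real_time:
        return "Asynchronous inference"

    # Anything larger, slower, or without a latency requirement runs offline.
    return "Batch transform"


# The scenario in the question: 1 GB input, 1 hour processing, near real-time.
print(choose_inference_option(Workload(1 * GB, 3600, True)))
```

For the workload in the question, the helper falls through the synchronous check and selects asynchronous inference, mirroring the conclusion above.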

A company uses Amazon SageMaker for its machine learning pipeline in a production environment. The company has large input data sizes up to 1 GB and processing times up to 1 hour. The company requires near real-time latency.

Which SageMaker inference option meets these requirements?

Last updated: February 8, 2026 at 20:17

- Real-time inference: 7.1%
- Serverless inference: 14.3%
- Asynchronous inference: 64.3%
