
A company uses Amazon SageMaker for its ML pipeline in a production environment. The company has large input data sizes up to 1 GB and processing times up to 1 hour. The company needs near real-time latency. Which SageMaker inference option meets these requirements?
Explanation:
Correct Answer: C. Asynchronous inference
Large Input Data (up to 1 GB): Asynchronous inference is designed to handle large payloads (up to 1 GB) that exceed the typical limits of real-time inference (usually up to 6 MB).
Long Processing Times (up to 1 hour): Real-time inference has an invocation timeout of 60 seconds, while asynchronous inference supports processing times of up to one hour per request, which matches the stated requirement exactly.
Near Real-time Latency Requirement: Asynchronous inference queues incoming requests and processes them as capacity allows, writing results to Amazon S3 and optionally publishing success or error notifications to Amazon SNS, so callers get results as soon as processing completes without holding a connection open.
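As a sketch of how this S3/SNS delivery pattern is wired up (the bucket path and SNS topic ARNs below are placeholders, not from the question), the AsyncInferenceConfig dict passed to the SageMaker CreateEndpointConfig API declares where results land and which topics are notified:

```python
# Sketch of the AsyncInferenceConfig block for SageMaker's
# create_endpoint_config API. S3 URI and SNS ARNs are placeholders.

def build_async_inference_config(output_s3_uri, success_topic_arn=None,
                                 error_topic_arn=None):
    """Return an AsyncInferenceConfig dict: results are written to S3,
    and optional SNS topics are notified on success or failure."""
    config = {"OutputConfig": {"S3OutputPath": output_s3_uri}}
    notifications = {}
    if success_topic_arn:
        notifications["SuccessTopic"] = success_topic_arn
    if error_topic_arn:
        notifications["ErrorTopic"] = error_topic_arn
    if notifications:
        config["OutputConfig"]["NotificationConfig"] = notifications
    return config

# Example with placeholder values; this dict would be passed as the
# AsyncInferenceConfig parameter of create_endpoint_config via boto3.
async_config = build_async_inference_config(
    "s3://my-bucket/async-results/",
    success_topic_arn="arn:aws:sns:us-east-1:123456789012:inference-done",
)
```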
A. Real-time inference: Limited to smaller payloads (typically up to 6 MB) and shorter processing times (usually 60-second timeout). Cannot handle 1 GB files or 1-hour processing times.
B. Serverless inference: Even tighter constraints than real-time inference (payloads of roughly 4 MB and a 60-second processing cap), plus cold-start latency. It cannot handle 1 GB inputs or hour-long processing.
D. Batch transform: Designed for offline processing of large datasets, not for near real-time requirements. It processes data in batches with no latency guarantees.
This solution allows the company to process large ML workloads while maintaining near real-time responsiveness through asynchronous processing patterns.
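A minimal invocation sketch of that pattern (the endpoint name and S3 URIs are assumptions for illustration): the payload is staged in S3 rather than sent inline, and `invoke_endpoint_async` returns immediately with the S3 location where the result will appear.

```python
# Sketch: queueing a request against an asynchronous SageMaker endpoint.
# Endpoint name and S3 URIs are placeholders; boto3 is imported lazily
# so the argument-building helper can be exercised without AWS access.

def build_invoke_args(endpoint_name, input_s3_uri):
    """Assemble kwargs for invoke_endpoint_async. Unlike real-time
    invoke_endpoint, the payload is referenced by an S3 URI (up to
    1 GB) instead of being sent in the request body."""
    if not input_s3_uri.startswith("s3://"):
        raise ValueError("input payload must already be staged in S3")
    return {"EndpointName": endpoint_name, "InputLocation": input_s3_uri}

def invoke_async(endpoint_name, input_s3_uri, region="us-east-1"):
    import boto3  # requires AWS credentials at call time
    runtime = boto3.client("sagemaker-runtime", region_name=region)
    # The call returns immediately; the result appears at OutputLocation
    # once the (possibly hour-long) processing finishes.
    response = runtime.invoke_endpoint_async(
        **build_invoke_args(endpoint_name, input_s3_uri)
    )
    return response["OutputLocation"]
```

The caller can then poll the returned S3 location or, more efficiently, subscribe to the SNS success topic configured on the endpoint.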