
Explanation:
Why Option B is correct:
Amazon Transcribe streaming with partial results enabled: This is crucial for real-time transcription where text fragments are delivered before customers finish speaking, meeting the requirement for "incremental suggestions while a customer is still speaking."
InvokeModelWithResponseStream API: This API enables streaming responses from Amazon Bedrock, allowing the GenAI model to provide incremental suggestions as it processes the incoming text fragments, maintaining low latency.
Amazon API Gateway WebSocket API: This supports bidirectional streaming, enabling real-time updates to call center agents as suggestions are generated.
End-to-end latency under 1 second: The combination of streaming transcription, streaming model inference, and WebSocket delivery creates a pipeline optimized for low latency.
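The partial-results behavior described above can be sketched in code. This is a minimal illustration, not a production consumer: it assumes the documented JSON shape of a Transcribe streaming TranscriptEvent (`Transcript.Results[].IsPartial`, `Alternatives[].Transcript`), which should be verified against the current API reference.

```python
# Sketch: pulling text fragments out of one Amazon Transcribe streaming
# TranscriptEvent. Field names follow the documented event shape
# (Transcript.Results[].IsPartial, Alternatives[].Transcript) and should be
# treated as an assumption to confirm against the current API docs.

def extract_fragments(transcript_event: dict, include_partial: bool = True) -> list[str]:
    """Return text fragments from a single streaming TranscriptEvent.

    With include_partial=True, still-changing partial hypotheses are
    returned as well; forwarding those early fragments is what makes
    sub-second incremental suggestions possible.
    """
    fragments = []
    for result in transcript_event.get("Transcript", {}).get("Results", []):
        if result.get("IsPartial") and not include_partial:
            continue  # skip hypotheses that Transcribe may still revise
        alternatives = result.get("Alternatives", [])
        if alternatives:
            fragments.append(alternatives[0].get("Transcript", ""))
    return fragments
```

Each fragment returned here would be forwarded immediately to the model, rather than waiting for the final (non-partial) result at the end of the utterance.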
Why other options are incorrect:
Option A: Uses the non-streaming InvokeModel API and stores results in Amazon DynamoDB before display, which adds latency and doesn't support real-time incremental updates.
Option C: Uses batch processing (not real-time) and requires complete transcripts, violating the requirement for suggestions while customers are still speaking.
Option D: Uses the Amazon Titan Embeddings model, which produces vector embeddings rather than generated text suggestions, and publishes to Amazon SNS (asynchronous fan-out), which doesn't support real-time bidirectional streaming to agents.
Key requirements met by Option B:
- Incremental suggestions while the customer is still speaking (Transcribe streaming with partial results)
- End-to-end latency under 1 second (streaming at every stage of the pipeline)
- Managed AWS services only (Amazon Transcribe, Amazon Bedrock, Amazon API Gateway)
- Bidirectional streaming to call center agents (API Gateway WebSocket API)
Question:
A financial services company is developing a real-time generative AI (GenAI) assistant to support human call center agents. The GenAI assistant must transcribe live customer speech, analyze context, and provide incremental suggestions to call center agents while a customer is still speaking. To preserve responsiveness, the GenAI assistant must maintain end-to-end latency under 1 second from speech to initial response display. The architecture must use only managed AWS services and must support bidirectional streaming to ensure that call center agents receive updates in real time.
Which solution will meet these requirements?
A
Use Amazon Transcribe streaming to transcribe calls. Pass the text to Amazon Comprehend for sentiment analysis. Feed the results to Anthropic Claude on Amazon Bedrock by using the InvokeModel API. Store results in Amazon DynamoDB. Use a WebSocket API to display the results.
B
Use Amazon Transcribe streaming with partial results enabled to deliver fragments of transcribed text before customers finish speaking. Forward text fragments to Amazon Bedrock by using the InvokeModelWithResponseStream API. Stream responses to call center agents through an Amazon API Gateway WebSocket API.
C
Use Amazon Transcribe batch processing to convert calls to text. Pass complete transcripts to Anthropic Claude on Amazon Bedrock by using the ConverseStream API. Return responses through an Amazon Lex chatbot interface.
D
Use the Amazon Transcribe streaming API with an AWS Lambda function to transcribe each audio segment. Call the Amazon Titan Embeddings model on Amazon Bedrock by using the InvokeModel API. Publish results to Amazon SNS.
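The Bedrock and WebSocket stages of Option B can be sketched as follows. This is a hedged illustration, not a production implementation: `parse_claude_chunk` assumes the Anthropic Messages streaming chunk format used on Bedrock (`content_block_delta` events carrying `delta.text`), and `stream_suggestions`, its parameters, and the WebSocket endpoint URL are hypothetical names; error handling, retries, and auth are omitted.

```python
import json


def parse_claude_chunk(payload: dict) -> str:
    """Extract the text delta from one streamed Bedrock chunk.

    Assumes the Anthropic Messages streaming shape (type ==
    "content_block_delta" with delta.text); verify against the current
    Bedrock model documentation for the model in use.
    """
    if payload.get("type") == "content_block_delta":
        return payload.get("delta", {}).get("text", "")
    return ""


def stream_suggestions(model_id: str, prompt: str,
                       connection_id: str, ws_endpoint: str) -> None:
    """Hedged sketch: stream a Bedrock response to one WebSocket client.

    All four parameters are illustrative; connection_id is the agent's
    API Gateway WebSocket connection and ws_endpoint is the stage's
    management endpoint URL.
    """
    import boto3  # imported lazily so the parser above stays dependency-free

    bedrock = boto3.client("bedrock-runtime")
    # The API Gateway Management API pushes server -> client messages
    # over an open WebSocket connection.
    apigw = boto3.client("apigatewaymanagementapi", endpoint_url=ws_endpoint)

    response = bedrock.invoke_model_with_response_stream(
        modelId=model_id,
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    for event in response["body"]:  # EventStream of chunk events
        payload = json.loads(event["chunk"]["bytes"])
        text = parse_claude_chunk(payload)
        if text:
            # Push each token batch to the agent as soon as it arrives,
            # rather than buffering the full completion.
            apigw.post_to_connection(ConnectionId=connection_id,
                                     Data=text.encode())
```

The design point this illustrates is that no stage buffers: transcript fragments enter the model as they arrive, and model tokens leave for the agent's browser as they are generated, which is what keeps first-response latency under the 1-second budget.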