
Explanation:
Option A best meets the combined requirements of low latency, stability, and validated safety controls by using purpose-built Amazon Bedrock features designed for production GenAI operations. The company's sub-1-second latency target and its observed degradation during traffic spikes strongly indicate capacity and throughput variability. Provisioned Throughput for Amazon Bedrock delivers more predictable performance by reserving inference capacity for a chosen model, reducing throttling risk and stabilizing response times under load. This directly improves operational consistency across Regions, where on-demand capacity can vary.
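To make the routing difference concrete, here is a minimal sketch of how an application would target reserved capacity: with Provisioned Throughput, the `modelId` passed to the `bedrock-runtime` `invoke_model` call is the provisioned model's ARN rather than an on-demand model ID. The ARN and the `inputText` body schema below are illustrative placeholders (request body formats vary by model), not values from the question.

```python
import json

# Placeholder ARN for a provisioned model (illustrative, not a real resource).
PROVISIONED_MODEL_ARN = (
    "arn:aws:bedrock:us-east-1:123456789012:provisioned-model/abc123"
)

def build_invoke_request(prompt: str) -> dict:
    """Build kwargs for a bedrock-runtime invoke_model call.

    Passing the provisioned model's ARN as modelId routes the request to
    reserved inference capacity instead of shared on-demand capacity.
    """
    return {
        "modelId": PROVISIONED_MODEL_ARN,
        "contentType": "application/json",
        "accept": "application/json",
        # Body schema is model-specific; a Titan-style "inputText" field is
        # shown here purely for illustration.
        "body": json.dumps({"inputText": prompt}),
    }

request = build_invoke_request("Recommend a pour-over recipe for a light roast.")
```

In a real deployment these kwargs would be passed to `boto3.client("bedrock-runtime").invoke_model(**request)`; the point is that only the `modelId` changes, so switching to Provisioned Throughput requires no change to the prompt chain itself.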
The requirement to "block unsafe or hallucinated recommendations" is most directly addressed by Amazon Bedrock Guardrails. Guardrails provide managed safety enforcement, including sensitive information controls and configurable content policies. Using semantic denial rules enables the application to prevent unsafe guidance such as dangerous brewing temperatures or other harmful procedural instructions, enforcing safety at the model boundary rather than relying on downstream filtering.
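A denied-topic policy of this kind can be sketched as the configuration passed to the Bedrock `create_guardrail` API. The topic name, definition, example phrase, and messaging strings below are assumptions chosen to match the brewing-safety scenario, not values from the question.

```python
def build_guardrail_config(name: str) -> dict:
    """Build kwargs for a bedrock create_guardrail call (sketch).

    A DENY-type topic blocks unsafe brewing guidance at the model
    boundary, before any downstream filtering would run.
    """
    return {
        "name": name,
        "topicPolicyConfig": {
            "topicsConfig": [
                {
                    "name": "UnsafeBrewing",  # illustrative topic name
                    "definition": (
                        "Brewing instructions involving dangerous temperatures "
                        "or other hazardous procedures."
                    ),
                    "examples": ["Brew espresso at 300 degrees Celsius."],
                    "type": "DENY",
                }
            ]
        },
        "blockedInputMessaging": "This request cannot be processed safely.",
        "blockedOutputsMessaging": "The response was blocked by safety policy.",
    }

config = build_guardrail_config("coffee-safety")
```

Once created, the guardrail is applied at inference time by passing its identifier and version alongside the model invocation, which is what enforces safety "at the model boundary" rather than in application code.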
The remaining requirement is "99.5% output consistency for identical inputs." While generative models can be probabilistic, production systems achieve practical consistency by controlling prompt versions, inputs, and policy behavior. Amazon Bedrock Prompt Management supports controlled prompt lifecycle practices, including versioning and approval workflows, which reduce unintended drift across deployments and Regions. By ensuring the same approved prompt templates and parameters are used consistently, the company can materially improve repeatability for the same structured inputs and retrieval context, which is essential in multi-stage prompt chains.
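The versioning workflow described above can be sketched as the request shape for the `bedrock-agent` `create_prompt` call; creating a numbered version afterward (via `create_prompt_version`) pins an immutable snapshot that every Region invokes. The prompt name, variant name, and template text are illustrative assumptions.

```python
def build_create_prompt_request(name: str, template: str) -> dict:
    """Build kwargs for a bedrock-agent create_prompt call (sketch).

    Storing the template centrally, then publishing immutable versions,
    ensures all Regions run the same approved prompt text.
    """
    return {
        "name": name,
        "variants": [
            {
                "name": "default",  # illustrative variant name
                "templateType": "TEXT",
                "templateConfiguration": {
                    # {{placeholders}} are filled per request with the
                    # structured inputs and retrieval context.
                    "text": {"text": template}
                },
            }
        ],
        "defaultVariant": "default",
    }

req = build_create_prompt_request(
    "brewing-recs",
    "Roast log: {{roast_log}}\nQuestion: {{question}}",
)
```

Pinning a specific published version in every Region, rather than referencing a mutable draft, is what prevents the cross-Region prompt drift the explanation warns about.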
The other options are incomplete. B improves experimentation and observability but does not enforce safety controls or stabilize latency. C can improve performance, but it does not provide validated safety enforcement at inference time. D can help retrieval relevance, but it does not address unsafe outputs or inference stability. Therefore, A is the only option that simultaneously targets predictable latency, governance of prompt behavior, and strong safety controls within Amazon Bedrock.
A company is building a generative AI application to provide step-by-step instructions for brewing specialty coffee. The application uses a multi-stage prompt chain that retrieves relevant coffee bean roasting logs from a knowledge base and then generates brewing recommendations. The company needs to ensure low latency, stability, and validated safety controls for production deployment across multiple AWS Regions.
Requirements:
• Latency must be under 1 second per request, and the company has observed degradation during traffic spikes.
• The application must block unsafe or hallucinated recommendations (e.g., dangerous brewing temperatures).
• Output consistency for identical inputs must be 99.5% or higher across Regions.
Which solution meets these requirements?
A
Use Amazon Bedrock Provisioned Throughput for consistent latency. Implement Amazon Bedrock Guardrails with semantic denial rules to block unsafe content. Use Amazon Bedrock Prompt Management to enforce prompt versioning and approval workflows.
B
Use Amazon SageMaker JumpStart to deploy a custom model with autoscaling. Implement AWS WAF to filter unsafe content. Use Amazon CloudWatch to monitor latency and consistency metrics.
C
Use Amazon Bedrock On-Demand throughput with caching via Amazon ElastiCache. Implement AWS Lambda to validate outputs against a safety database. Use Amazon EventBridge to orchestrate prompt chains.
D
Use Amazon Kendra to improve roast log retrieval accuracy. Store normalized prompt metadata within Amazon DynamoDB. Use AWS Step Functions to orchestrate multi-step prompts.