
Explanation:
Option B is correct because it provides a comprehensive solution that addresses all requirements:
Automated quality evaluations at scale: Amazon Bedrock evaluations with Anthropic Claude Sonnet as a judge model enable systematic, repeatable quality assessment across large volumes of interactions. This addresses the need for scalable automated evaluation of factual accuracy and conversational appropriateness.
Compliance enforcement: Amazon Bedrock guardrails provide a dedicated policy enforcement layer that can block or intervene when responses violate financial compliance constraints. This is crucial in regulated financial contexts, where automated enforcement is required.
Targeted human reviews: Amazon Augmented AI (A2I) integrates human review workflows for flagged critical interactions, ensuring human oversight where needed without requiring manual review of all responses.
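As a rough illustration of how these three pieces fit together, the sketch below shows escalation logic that routes an interaction to Amazon A2I human review when the judge model's score falls below a threshold or a guardrail intervenes. The threshold value and field names are hypothetical, and the actual boto3 calls to Bedrock evaluations, guardrails, and A2I are omitted:

```python
# Minimal sketch of the escalation logic described above.
# JUDGE_SCORE_THRESHOLD and the interaction field names are hypothetical;
# real Bedrock evaluation jobs and A2I human loops are started via
# boto3 API calls that are omitted here.

JUDGE_SCORE_THRESHOLD = 0.7  # hypothetical minimum acceptable judge score


def should_escalate_to_human(judge_score: float, guardrail_intervened: bool) -> bool:
    """Flag an interaction for human review when the LLM-judge score is
    low or a guardrail blocked or modified the response."""
    return guardrail_intervened or judge_score < JUDGE_SCORE_THRESHOLD


def triage(interactions: list[dict]) -> list[dict]:
    """Return only the interactions that need targeted human review."""
    return [
        i for i in interactions
        if should_escalate_to_human(i["judge_score"], i["guardrail_intervened"])
    ]


if __name__ == "__main__":
    sample = [
        {"id": "a", "judge_score": 0.95, "guardrail_intervened": False},
        {"id": "b", "judge_score": 0.40, "guardrail_intervened": False},
        {"id": "c", "judge_score": 0.90, "guardrail_intervened": True},
    ]
    print([i["id"] for i in triage(sample)])  # prints ['b', 'c']
```

This keeps human reviewers focused on the small set of flagged critical interactions rather than the full response volume.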
Why the other options are incorrect:
Option A: Relies entirely on manual scoring by financial experts for ALL responses, which cannot scale and fails the requirement for automated quality evaluations at scale.
Option C: Uses Amazon Lex (a conversational bot framework) rather than Amazon Bedrock's evaluation capabilities. A static compliance database is less flexible than Bedrock guardrails, and collecting end-user reviews does not provide systematic quality evaluation.
Option D: CloudWatch is for monitoring and alerting, not for systematic evaluation of response quality. It lacks the automated evaluation capabilities and compliance enforcement mechanisms provided by Bedrock evaluations and guardrails.
This solution effectively combines AWS's managed GenAI capabilities to meet the requirements of scalable automated evaluation, compliance enforcement, and targeted human oversight.
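For concreteness, a denied-topic guardrail of the kind described in option B might be configured roughly as shown below. The request shape mirrors the Amazon Bedrock CreateGuardrail API as best understood here, but the guardrail name, topic definition, and messages are all hypothetical; verify the exact field names against the current boto3 documentation before use:

```python
# Hypothetical configuration for a financial-compliance guardrail.
# Field names follow the Amazon Bedrock CreateGuardrail request shape;
# treat the structure as an assumption and confirm against the boto3 docs.

guardrail_request = {
    "name": "fintech-compliance-guardrail",  # hypothetical name
    "description": "Blocks non-compliant financial advice.",
    "topicPolicyConfig": {
        "topicsConfig": [
            {
                "name": "UnlicensedInvestmentAdvice",
                "definition": "Specific buy or sell recommendations for securities.",
                "type": "DENY",
            }
        ]
    },
    "blockedInputMessaging": "This request cannot be processed.",
    "blockedOutputsMessaging": "The response was withheld for compliance reasons.",
}

# In a real deployment this dict would be passed to the Bedrock control-plane
# client, e.g. boto3.client("bedrock").create_guardrail(**guardrail_request).
denied_topics = [
    t["name"] for t in guardrail_request["topicPolicyConfig"]["topicsConfig"]
]
print(denied_topics)  # prints ['UnlicensedInvestmentAdvice']
```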
A financial technology company is using Amazon Bedrock to build an assessment system for the company's customer service AI assistant. The AI assistant must provide financial recommendations that are factually accurate, compliant with financial regulations, and conversationally appropriate. The company needs to combine automated quality evaluations at scale with targeted human reviews of critical interactions.
Which solution will meet these requirements?
A
Configure a pipeline in which financial experts manually score all responses for accuracy, compliance, and conversational quality. Use Amazon SageMaker notebooks to analyze results to identify improvement areas.
B
Configure Amazon Bedrock evaluations that use Anthropic Claude Sonnet as a judge model to assess response accuracy and appropriateness. Configure custom Amazon Bedrock guardrails to check responses for compliance with financial policies. Add Amazon Augmented AI (Amazon A2I) human reviews for flagged critical interactions.
C
Create an Amazon Lex bot to manage customer service interactions. Configure AWS Lambda functions to check responses against a static compliance database. Configure intents that call the Lambda functions. Add an additional intent to collect end-user reviews.
D
Configure Amazon CloudWatch to monitor response patterns from the AI assistant. Configure CloudWatch alerts for potential compliance violations. Establish a team of human evaluators to review flagged interactions.