
Explanation:
Option B is the correct answer because it comprehensively addresses all three requirements:
- Diagnoses inconsistencies: A test suite built from complex clinical documents lets the company systematically reproduce and diagnose the inconsistencies that appear only with complex documents.
- Compares prompt performance against established metrics: Quantifiable evaluation metrics combined with an automated testing framework allow objective comparison of prompt versions against measurable criteria.
- Maintains historical records of prompt versions: Version control in a code repository provides a complete history of prompt changes, so the team can track their evolution, revert to previous versions when needed, and see which changes led to performance improvements or regressions.
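The Option B workflow above can be sketched as a small harness. The following is a minimal, runnable illustration, not a definitive implementation: the Bedrock call is stubbed out, and the test documents, required-term coverage metric, and prompt templates are all hypothetical assumptions for the sake of the example.

```python
import json
import statistics

# Illustrative test suite (hypothetical): complex clinical documents paired
# with clinical terms a faithful summary must preserve.
TEST_SUITE = [
    {"doc": "Patient presents with chest pain and elevated troponin; history of CAD.",
     "required_terms": ["chest pain", "troponin", "CAD"]},
    {"doc": "Post-op day 2: low-grade fever, wound erythema, WBC 14.2.",
     "required_terms": ["fever", "erythema", "WBC"]},
]

def summarize(prompt_template: str, document: str) -> str:
    """Stand-in for the Amazon Bedrock model invocation.

    A real harness would render the prompt and call the bedrock-runtime
    service; here it simply echoes the rendered prompt so the sketch runs
    without AWS credentials.
    """
    return prompt_template.format(document=document)

def coverage_score(summary: str, required_terms: list[str]) -> float:
    """Quantifiable metric: fraction of required terms kept in the summary."""
    hits = sum(term.lower() in summary.lower() for term in required_terms)
    return hits / len(required_terms)

def evaluate_prompt(prompt_template: str) -> float:
    """Run the whole test suite and return the mean coverage score."""
    return statistics.mean(
        coverage_score(summarize(prompt_template, case["doc"]), case["required_terms"])
        for case in TEST_SUITE
    )

# Two prompt versions as they might be checked into the repository.
PROMPTS = {
    "v1": "Summarize briefly: {document}",
    "v2": "Summarize, preserving all abnormal findings and lab values: {document}",
}
results = {name: evaluate_prompt(tpl) for name, tpl in PROMPTS.items()}
print(json.dumps(results, indent=2))
```

With the prompt templates and test suite stored in the repository, running this harness in CI on every commit documents performance patterns per prompt version, which is exactly the historical record the requirements call for.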
Why other options are insufficient:
- Option A: Tests only with simple documents (which already work well), lacks objective metrics, and provides neither systematic diagnosis nor historical tracking.
- Option C: Focuses on A/B testing in production but does not systematically diagnose the inconsistencies or maintain comprehensive historical records of prompt performance.
- Option D: Adds medical entity analysis but introduces unnecessary service coupling, lacks automated benchmarking, and does not provide robust version history management.
A medical company uses Amazon Bedrock to power a clinical documentation summarization system. The system produces inconsistent summaries for complex clinical documents, although it performs well on simple ones.
The company needs a solution that diagnoses the inconsistencies, compares prompt performance against established metrics, and maintains historical records of prompt versions.
Which solution will meet these requirements?
A
Create multiple prompt variants by using Prompt management in Amazon Bedrock. Manually test the prompts with simple clinical documents. Deploy the highest performing version by using the Amazon Bedrock console.
B
Implement version control for prompts in a code repository with a test suite that contains complex clinical documents and quantifiable evaluation metrics. Use an automated testing framework to compare prompt versions and document performance patterns.
C
Deploy each new prompt version to separate Amazon Bedrock API endpoints. Split production traffic between the endpoints. Configure Amazon CloudWatch to capture response metrics and user feedback for automatic version selection.
D
Create a custom prompt evaluation flow in Amazon Bedrock Flows that applies the same clinical document inputs to different prompt variants. Use Amazon Comprehend Medical to analyze and score the factual accuracy of each version.