
Answer-first summary for fast verification
Answer: Adopt RAG with Knowledge Bases to retrieve only relevant context at runtime
## Explanation

**Correct Answer: B - Adopt RAG with Knowledge Bases to retrieve only relevant context at runtime**

**Why this is correct:**

1. **RAG (Retrieval-Augmented Generation)** with Knowledge Bases lets the model retrieve only the relevant passages from a knowledge base at runtime, rather than processing the entire corpus on every query.
2. **Cost reduction**: Retrieving only relevant context shrinks each prompt, reducing the input tokens processed per query and therefore the per-request cost.
3. **Performance maintenance**: RAG maintains or improves answer quality by grounding the model in precise, relevant information instead of overwhelming it with unnecessary data.
4. **Amazon Bedrock integration**: Knowledge Bases for Amazon Bedrock is a managed feature designed specifically for RAG implementations.

**Why the other options are incorrect:**

- **A) Use fine-tuning for every patient query**: Fine-tuning is a slow, expensive training process; running it per query is impractical and does not scale for dynamic patient data.
- **C) Store entire medical records in each prompt**: This inflates token usage dramatically, raising costs and often degrading quality as prompts approach context-window limits.
- **D) Scale up GPU instances permanently**: This raises costs without addressing the root inefficiency, and permanent over-provisioning is the opposite of cost optimization.

**Key AWS concepts:**

- **Knowledge Bases for Amazon Bedrock**: Managed RAG capability with vector storage and retrieval
- **Cost optimization**: Reducing token usage and computational requirements
- **Performance efficiency**: Balancing cost with response quality and speed
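The cost contrast between options B and C can be sketched in plain Python. This is a toy model, not the Bedrock API: the sample records, the keyword-overlap retriever, and the one-token-per-word estimate are illustrative stand-ins for a real vector store and tokenizer.

```python
# Toy comparison: option C (stuff all records into the prompt) vs
# option B (RAG: retrieve only the relevant record at runtime).
# All data and scoring below are illustrative assumptions.

def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token per whitespace-separated word."""
    return len(text.split())

def retrieve(query: str, records: list[str], top_k: int = 1) -> list[str]:
    """Score each record by keyword overlap with the query; keep the top_k."""
    q_words = set(query.lower().split())
    scored = sorted(
        records,
        key=lambda r: len(q_words & set(r.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

records = [
    "patient 001 glucose levels elevated in march labs",
    "patient 002 mri scan clear no follow up required",
    "patient 003 prescribed statins for cholesterol management",
]
query = "what were patient 001 glucose levels"

# Option C: every record travels with every query.
full_prompt = query + " " + " ".join(records)

# Option B: only the most relevant record is added as context.
rag_prompt = query + " " + " ".join(retrieve(query, records, top_k=1))

print(estimate_tokens(full_prompt), "tokens vs", estimate_tokens(rag_prompt), "tokens")
```

The gap widens as the record store grows: option C's prompt scales with the whole corpus, while the RAG prompt stays roughly constant at query-plus-top-k size.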
Author: Ritesh Yadav
A hospital research team runs several generative-AI workloads using Bedrock. To reduce cost while maintaining performance, which strategy should they follow?
A. Use fine-tuning for every patient query
B. Adopt RAG with Knowledge Bases to retrieve only relevant context at runtime
C. Store entire medical records in each prompt
D. Scale up GPU instances permanently