
Explanation:
Option A is the correct solution because hybrid search directly addresses the core retrieval failure modes while maintaining low latency and minimal operational overhead. In medical and scientific domains, exact terminology, abbreviations, and acronyms (for example, drug names, procedures, or conditions) are critical. Pure vector similarity search often underweights these exact matches, leading to missed results and an excess of semantically related but irrelevant documents.
Amazon OpenSearch Service natively supports hybrid search, which combines keyword-based retrieval (such as BM25) with vector similarity search. Keyword search ensures precise matching for exact terms and acronyms, while vector search captures semantic meaning and contextual similarity. By blending these approaches, the retrieval system improves both precision and recall without introducing additional infrastructure.
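As a sketch of what this looks like in practice, a hybrid request in OpenSearch nests a keyword clause and a k-NN clause under the `hybrid` query type. The index name, field names, and query vector below are illustrative assumptions, not values from the scenario, and a search pipeline with a normalization processor is assumed to already exist on the domain:

```python
# Illustrative hybrid-search request body for OpenSearch (hybrid query type).
# The index layout (a "text" keyword field and a "text_embedding" k-NN field)
# and the 384-dimension query vector are assumptions for this example.

def build_hybrid_query(query_text, query_vector, k=10):
    """Combine BM25 keyword matching with k-NN vector similarity in one query."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Keyword clause: precise matching for exact terms and
                    # acronyms (e.g. "MRI", "ACE inhibitor").
                    {"match": {"text": {"query": query_text}}},
                    # Vector clause: semantic similarity via k-NN search.
                    {"knn": {"text_embedding": {"vector": query_vector, "k": k}}},
                ]
            }
        },
    }

body = build_hybrid_query("ACE inhibitor dosage", [0.1] * 384)
```

The request is sent to the ordinary `_search` endpoint with a `search_pipeline` parameter naming the configured pipeline, so both clauses execute in the same query path against the same index.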
Hybrid search operates within the same OpenSearch index and query path, which preserves low end-user latency even at large scale. This is especially important as the document collection grows to millions of documents. Because OpenSearch handles scoring and ranking internally, no additional orchestration layers or post-processing steps are required.
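The internal score blending mentioned above is configured once as a search pipeline with a normalization processor, which rescales the BM25 and k-NN score distributions before combining them. A minimal sketch, in which the combination technique and the 0.3/0.7 weights are example values rather than recommendations from the scenario:

```python
# Illustrative search-pipeline definition for OpenSearch hybrid search.
# Scores from the keyword and vector sub-queries are min-max normalized,
# then blended as a weighted arithmetic mean, entirely server-side.
hybrid_pipeline = {
    "description": "Normalize and combine keyword and vector scores",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    # Example weights: keyword clause, then vector clause.
                    "parameters": {"weights": [0.3, 0.7]},
                },
            }
        }
    ],
}
```

The pipeline is created once (via `PUT /_search/pipeline/<name>`) and referenced per query, so no extra services or post-processing steps sit between the application and the index.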
Why other options are less optimal:
Option B: Increasing embedding dimensions raises storage and compute costs without guaranteeing better exact-term matching, and a post-retrieval Lambda filter adds latency plus another component to build, tune, and maintain.
Option C: Replacing OpenSearch Service with Amazon Kendra requires migrating the entire retrieval layer and maintaining query-expansion pre-processing, which is significant operational overhead rather than the least.
Option D: A two-stage architecture with a SageMaker-hosted re-ranking model introduces an additional endpoint to deploy, scale, and monitor, and adds per-query latency that grows with the result set.
Therefore, Option A delivers the best balance of retrieval quality, scalability, latency, and operational simplicity for medical RAG workloads.
A medical company is building a generative AI (GenAI) application that uses Retrieval Augmented Generation (RAG) to provide evidence-based medical information. The application uses Amazon OpenSearch Service to retrieve vector embeddings. Users report that searches frequently miss results that contain exact medical terms and acronyms and return too many semantically similar but irrelevant documents. The company needs to improve retrieval quality and maintain low end-user latency, even as the document collection grows to millions of documents.
Which solution will meet these requirements with the LEAST operational overhead?
A. Configure hybrid search by combining vector similarity with keyword matching to improve semantic understanding and exact term and acronym matching.
B. Increase the dimensions of the vector embeddings from 384 to 1536. Use a post-processing AWS Lambda function to filter out irrelevant results after retrieval.
C. Replace OpenSearch Service with Amazon Kendra. Use query expansion to handle medical acronyms and terminology variants during pre-processing.
D. Implement a two-stage retrieval architecture in which initial vector search results are re-ranked by an ML model hosted on Amazon SageMaker.