
Answer: Cosine Similarity
## Explanation

**Cosine Similarity** is the correct answer because:

1. **Vector-based similarity search**: When both the query and the documents are converted into vectors (embeddings), cosine similarity is commonly used to measure how similar those vectors are.
2. **How it works**: Cosine similarity calculates the cosine of the angle between two vectors in a multi-dimensional space. It ranges from -1 (completely opposite) to 1 (identical), with 0 indicating orthogonality (no similarity).
3. **Why not the other options**:
   - **Decision Trees** (A): Used for classification and regression tasks, not similarity search.
   - **Random Forest** (C): An ensemble method that combines multiple decision trees, also used for classification and regression, not similarity search.
   - **Token Merging** (D): Refers to text-processing techniques such as tokenization or merging subword tokens, not to measuring similarity between vectors.
4. **Real-world application**: This scenario describes a typical **semantic search** or **information retrieval** system, in which documents and queries are embedded into a vector space and cosine similarity finds the most relevant documents by measuring semantic similarity rather than simple keyword matching.
5. **Alternative similarity measures**: Other metrics such as Euclidean distance or the dot product could also be used, but cosine similarity is preferred for text embeddings because it is insensitive to vector magnitude and focuses on direction (semantic meaning).
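The search described in the question can be sketched in a few lines. The example below uses toy 3-dimensional vectors purely for illustration (real embedding models produce vectors with hundreds or thousands of dimensions, and the paper names are hypothetical); the ranking logic is the same:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: (a . b) / (|a| * |b|)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy "embeddings" for a query and three papers.
query = np.array([1.0, 2.0, 0.0])
papers = {
    "paper_a": np.array([2.0, 4.0, 0.0]),    # same direction as the query -> score 1
    "paper_b": np.array([-1.0, -2.0, 0.0]),  # opposite direction -> score -1
    "paper_c": np.array([0.0, 0.0, 3.0]),    # orthogonal -> score 0
}

# Score every paper against the query and return the closest match.
scores = {name: cosine_similarity(query, vec) for name, vec in papers.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # paper_a scores highest
```

Note that `paper_a` is twice the length of `query` yet still scores a perfect 1.0: cosine similarity ignores magnitude and compares only direction, which is exactly why it suits text embeddings.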
Author: Jin H
A university uses an AI model to find research papers related to a student's query. The model converts the query and all papers into vectors, then finds the closest match. Which algorithm is used for this similarity search?
A. Decision Trees
B. Cosine Similarity
C. Random Forest
D. Token Merging