
Answer-first summary for fast verification
Answer: Compare the cosine similarities of the embeddings of returned results against those of a representative sample of test inputs
The question asks how to evaluate the semantic accuracy of two vector indexing techniques (LSH versus HNSW). Semantic accuracy here means how well the vectors an index returns preserve the similarity relationships of the original embedding space. Cosine similarity measures the angle between two embedding vectors, which is the standard proxy for semantic closeness in high-dimensional embedding spaces, so it applies directly to the results of either index. BLEU and ROUGE are designed to score generated text against reference text (machine translation and summarization, respectively), not to compare vectors. Levenshtein distance measures the edit distance between strings, so it does not apply to vector embeddings either. The community discussion shows consensus for cosine similarity, with one comment correctly noting that Levenshtein distance is not applicable to the semantic accuracy of vector indexing.
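As a rough illustration of the comparison described above, the following sketch scores two sets of retrieved results by their average cosine similarity to the query embeddings. The function names, the toy 2-dimensional vectors, and the two result sets (`index_a_results`, `index_b_results`) are all hypothetical; in practice the results would come from querying the actual LSH and HNSW indexes with a representative sample of test inputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def mean_result_similarity(queries, results_per_query):
    """Average cosine similarity between each query and its returned results."""
    scores = [
        cosine_similarity(query, result)
        for query, results in zip(queries, results_per_query)
        for result in results
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two query embeddings and the embeddings that
# two different indexes returned for each query.
queries = [[1.0, 0.0], [0.0, 1.0]]
index_a_results = [[[1.0, 0.0], [1.0, 1.0]], [[0.0, 2.0]]]
index_b_results = [[[0.0, 1.0], [1.0, 1.0]], [[1.0, 0.0]]]

score_a = mean_result_similarity(queries, index_a_results)
score_b = mean_result_similarity(queries, index_b_results)
print(f"index A: {score_a:.3f}, index B: {score_b:.3f}")
```

In this toy setup the index whose results point in nearly the same direction as the queries gets the higher score. Averaging is only one possible way to aggregate; the same per-result similarities could also be compared per query or per rank position.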
Author: LeetQuiz Editorial Team
A Generative AI Engineer is evaluating LSH (Locality Sensitive Hashing) and HNSW (Hierarchical Navigable Small World) for indexing a vector database, with semantic accuracy as the primary concern.
Which method should be used to compare these two indexing techniques?
A. Compare the cosine similarities of the embeddings of returned results against those of a representative sample of test inputs
B. Compare the Bilingual Evaluation Understudy (BLEU) scores of returned results for a representative sample of test inputs
C. Compare the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores of returned results for a representative sample of test inputs
D. Compare the Levenshtein distances of returned results against a representative sample of test inputs