
Answer-first summary for fast verification
Answer: C and E.
C: Choose an appropriate evaluation metric (such as recall or NDCG) and experiment with changes to the chunking strategy, such as splitting chunks by paragraph or chapter; choose the strategy that yields the best metric value.
E: Create an LLM-as-a-judge metric to evaluate how well previous questions are answered by the most appropriate chunk, and optimize the chunking parameters based on the values of that metric.
The question asks for TWO strategies to optimize the chunking strategy and its parameters. Option C uses an appropriate retrieval metric (such as recall or NDCG) and experiments with different chunking strategies (paragraphs, chapters), selecting whichever performs best on the metric. Option E uses an LLM-as-a-judge to score how well previous questions are answered by the most appropriate chunk, then optimizes the chunking parameters against that score. Both C and E replace intuition with systematic, metric-driven experimentation, which is exactly what the engineer wants. The community discussion strongly supports CE, noting that these are the only options that actually evaluate chunking; option D is weaker because an LLM-suggested token count is constrained by context limits and does not measure retrieval quality.
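To make option C concrete, here is a minimal sketch of a chunking experiment: two candidate chunkers are evaluated with recall@k over a small question set. Everything here is hypothetical illustration — the toy word-overlap `score` stands in for real vector similarity, and the text and questions are invented.

```python
def chunk_by_paragraph(text):
    # One chunk per paragraph (split on blank lines).
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def chunk_fixed(text, size=40):
    # Fixed-size character chunks: a deliberately naive baseline.
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(query, chunk):
    # Toy relevance score: word overlap (a stand-in for embedding similarity).
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def recall_at_k(query, chunks, answer_span, k=2):
    # 1.0 if any top-k retrieved chunk contains the known answer span.
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return float(any(answer_span in c for c in ranked[:k]))

# Hypothetical corpus and evaluation set.
text = ("The dragon Vyrax guards the northern pass.\n\n"
        "Queen Lira rules the city of Amberfall.\n\n"
        "The sword Duskbane was forged in the third age.")
eval_set = [("Who guards the northern pass?", "Vyrax"),
            ("Who rules Amberfall?", "Lira")]

# Compare strategies on the same metric and keep the winner.
for name, chunker in [("paragraph", chunk_by_paragraph), ("fixed-40", chunk_fixed)]:
    chunks = chunker(text)
    r = sum(recall_at_k(q, chunks, a) for q, a in eval_set) / len(eval_set)
    print(f"{name}: recall@2 = {r:.2f}")
```

In a real system the same loop would run against the vector store, with recall@k or NDCG computed from labeled (question, relevant-chunk) pairs.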
Author: LeetQuiz Editorial Team
A Generative AI Engineer has developed a RAG application to answer questions about a series of fantasy novels from the author's web forum. The text from the novels is chunked and embedded into a vector store along with metadata (page number, chapter number, book title). These chunks are retrieved based on the user's query and sent to an LLM to generate a response. The engineer initially selected the chunking strategy and its configurations based on intuition but now wants to choose the optimal values more systematically.
Which TWO strategies should the engineer use to optimize their chunking strategy and parameters? (Select two.)
A. Change embedding models and compare performance.
B. Add a classifier for user queries that predicts which book will best contain the answer. Use this to filter retrieval.
C. Choose an appropriate evaluation metric (such as recall or NDCG) and experiment with changes in the chunking strategy, such as splitting chunks by paragraphs or chapters. Choose the strategy that gives the best performance metric.
D. Pass known questions and best answers to an LLM and instruct the LLM to provide the best token count. Use a summary statistic (mean, median, etc.) of the best token counts to choose chunk size.
E. Create an LLM-as-a-judge metric to evaluate how well previous questions are answered by the most appropriate chunk. Optimize the chunking parameters based upon the values of the metric.
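Option E's parameter sweep can be sketched as follows. Note the `judge` function here is only a placeholder: a real implementation would prompt an actual LLM to rate, say on a 0-1 scale, how well the chunk answers the question. The sweep then picks the chunk size with the best average judge score.

```python
def judge(question, chunk):
    # Placeholder judge (hypothetical): 1.0 if the chunk shares any long word
    # with the question. Replace with a real LLM-as-a-judge call in practice.
    words = [w for w in question.lower().split() if len(w) > 4]
    return 1.0 if any(w in chunk.lower() for w in words) else 0.0

def best_chunk(question, chunks):
    # The most appropriate chunk for the question, per the judge.
    return max(chunks, key=lambda c: judge(question, c))

def evaluate(chunk_size, text, questions):
    # Chunk the corpus at this size and average the judge score
    # of each question's best chunk.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return sum(judge(q, best_chunk(q, chunks)) for q in questions) / len(questions)

# Hypothetical corpus and previous forum questions.
text = "Queen Lira rules Amberfall. The dragon Vyrax guards the pass."
questions = ["Who rules Amberfall?", "Who guards the pass?"]

# Sweep candidate chunk sizes and keep the best-scoring one.
scores = {size: evaluate(size, text, questions) for size in (20, 40, 60)}
best_size = max(scores, key=scores.get)
print(scores, "-> best chunk size:", best_size)
```

The same loop generalizes to any chunking parameter (overlap, split boundary, metadata granularity): define the judge once, then optimize the parameters against its scores.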