
Answer-first summary for fast verification
Answer (D): Split the HR documentation into chunks and embed them into a vector store. Use the employee's question to retrieve the best-matching chunks of documentation, then use an LLM to generate a response to the employee based on the retrieved documentation.
Option D represents the standard and most effective approach: a retrieval-augmented generation (RAG) system, the recommended architecture for answering questions from document knowledge. It involves:

1. Chunking the HR PDF documentation into manageable pieces.
2. Creating embeddings for each chunk and storing them in a vector database.
3. Retrieving the most relevant chunks using the employee's question as the query.
4. Generating a response with an LLM, passing the retrieved chunks as context.

This approach is scalable and efficient, and it avoids the context-window limitations of passing entire documents to an LLM.

Why the other options fall short:

- Option A is inefficient: averaging embeddings over entire documents is coarse and can lose important details, and passing whole documents strains even large context windows.
- Option B relies on LLM-generated summaries, which can introduce inaccuracies and forfeits the benefits of granular retrieval.
- Option C, while potentially useful for recommendation systems, is overly complex for this use case and requires historical interaction data that may not be available initially.

The community discussion confirms D as the simplest and most appropriate high-level architecture.
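The four steps above can be sketched end to end. This is a minimal toy illustration, not a production implementation: the bag-of-words "embedding", the list-based "vector store", and the prompt-returning "LLM" step are all hypothetical stand-ins for a real embedding model, a vector database, and an LLM call.

```python
# Toy sketch of the RAG pipeline: chunk -> embed -> retrieve -> generate.
# All components here are simplified stand-ins for real services.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in embedding: a term-frequency vector (a real system would
    # call an embedding model).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def chunk(document: str, size: int = 12) -> list:
    # Step 1: split the document into fixed-size word chunks (real systems
    # often chunk by tokens, with overlap).
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_store(documents: list) -> list:
    # Step 2: the "vector store" is just a list of (chunk, embedding) pairs.
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(store: list, question: str, k: int = 2) -> list:
    # Step 3: rank chunks by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


def answer(store: list, question: str) -> str:
    # Step 4: a real system would send this prompt to an LLM; here we
    # simply return the assembled prompt.
    context = "\n".join(retrieve(store, question))
    return f"Answer using only this context:\n{context}\nQuestion: {question}"


hr_docs = [
    "Employees accrue 20 vacation days per year. Unused vacation days "
    "carry over up to a maximum of 5 days into the next calendar year."
]
store = build_store(hr_docs)
```

In practice each stand-in is swapped for a real component (a tokenizer-aware chunker, an embedding model, a vector database, and an LLM call), but the data flow stays exactly as in the answer: the question touches only the retrieved chunks, never the full document set.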
Author: LeetQuiz Editorial Team
A Generative AI Engineer is tasked with designing an LLM-based application to answer employee HR questions by leveraging HR PDF documentation. What is the correct sequence of high-level tasks for this system?
A
Calculate averaged embeddings for each HR document, compare embeddings to user query to find the best document. Pass the best document with the user query into an LLM with a large context window to generate a response to the employee.
B
Use an LLM to summarize HR documentation. Provide summaries of documentation and user query into an LLM with a large context window to generate a response to the user.
C
Create an interaction matrix of historical employee questions and HR documentation. Use ALS to factorize the matrix and create embeddings. Calculate the embeddings of new queries and use them to find the best HR documentation. Use an LLM to generate a response to the employee question based upon the documentation retrieved.
D
Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved.