
Answer-first summary for fast verification
Answer: Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving
The correct sequence for building and deploying a RAG application follows a logical workflow: first, ingest and index documents to create the knowledge base in Vector Search; then users submit queries, the LLM retrieves the relevant documents and generates responses; finally, the model is evaluated on its performance before deployment. Option D sequences these steps correctly: document ingestion → indexing → query processing → retrieval → response generation → evaluation → deployment. The community discussion strongly supports D (80% consensus), reasoning that evaluation must come after response generation so that output quality, safety, and performance can be assessed. Options A, B, and C are incorrect because they either place evaluation before response generation (A, C) or disrupt the logical flow by having the user query before document ingestion and indexing (B).
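The sequence in option D can be sketched end to end with a minimal, self-contained toy pipeline. This is an illustration only, not a real Databricks Vector Search or Model Serving API: the embedding is a simple term-frequency vector, the vector store is an in-memory list, and the "LLM" response is stubbed. All function and variable names here are hypothetical.

```python
from collections import Counter
import math

def embed(text):
    """Toy embedding: term-frequency vector over lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 1: Ingest documents from a source (hard-coded policy snippets here).
docs = [
    "Employees receive 20 vacation days per year.",
    "Remote work requires manager approval.",
]

# Step 2: Index the documents and save to the (toy, in-memory) vector store.
index = [(doc, embed(doc)) for doc in docs]

# Steps 3-4: User submits a query; retrieve the most relevant document.
query = "How many vacation days do I get?"
q_vec = embed(query)
retrieved = max(index, key=lambda pair: cosine(q_vec, pair[1]))[0]

# Step 5: LLM generates a response grounded in the retrieved context
# (stubbed; a real application would call a served model with a prompt).
response = f"Per company policy: {retrieved}"

# Step 6: Evaluate before deployment (toy groundedness check; real
# evaluation would score quality, safety, and performance on a test set).
assert retrieved.lower() in response.lower()

# Step 7 (deployment via Model Serving) happens only after evaluation passes.
print(response)
```

The point of the sketch is the ordering: the index must exist before retrieval, generation must happen before evaluation, and deployment comes last.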
Author: LeetQuiz Editorial Team
A Generative AI Engineer is developing a RAG application to answer employee questions about company policies.
What are the necessary steps to build and deploy this RAG application?
A
Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> Evaluate model -> LLM generates a response -> Deploy it using Model Serving
B
User submits queries against an LLM -> Ingest documents from a source -> Index the documents and save to Vector Search -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving
C
Ingest documents from a source -> Index the documents and save to Vector Search -> Evaluate model -> Deploy it using Model Serving -> User submits queries against an LLM -> LLM retrieves relevant documents -> LLM generates a response
D
Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving