
After switching the response generation LLM in a RAG pipeline from GPT-4 to a self-hosted model with a shorter context length, the following error occurs: [screenshot of error message]
Without changing the response generation model, which TWO solutions should be implemented? (Choose two.)

A. Use a smaller embedding model to generate embeddings
B. Reduce the maximum output tokens of the new model
C. Decrease the chunk size of embedded documents
D. Reduce the number of records retrieved from the vector database
E. Retrain the response-generating model using ALiBi
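
For context, here is a minimal sketch (plain Python, no specific RAG framework assumed, and all token counts below are hypothetical) of how chunk size (option C) and the number of retrieved records (option D) determine the size of the prompt sent to the response generation model, and why shrinking either lets the prompt fit a smaller context window without changing the model:

```python
# Approximate the input tokens a RAG pipeline sends to the generator.
# question_tokens and template_tokens are illustrative fixed overheads.
def prompt_tokens(chunk_size: int, num_retrieved: int,
                  question_tokens: int = 50, template_tokens: int = 100) -> int:
    """Rough input token count: retrieved context plus prompt overhead."""
    return chunk_size * num_retrieved + question_tokens + template_tokens

CONTEXT_WINDOW = 4096      # hypothetical limit of the self-hosted model
MAX_OUTPUT_TOKENS = 512    # tokens reserved for the generated answer
INPUT_BUDGET = CONTEXT_WINDOW - MAX_OUTPUT_TOKENS

# Original settings overflow the smaller context window.
print(prompt_tokens(chunk_size=1024, num_retrieved=8) > INPUT_BUDGET)   # True

# Smaller chunks and fewer retrieved records shrink the prompt so it fits.
print(prompt_tokens(chunk_size=512, num_retrieved=4) <= INPUT_BUDGET)   # True
```

In a real pipeline these two knobs typically live in the document splitter configuration (chunk size) and the vector store retriever settings (number of records returned per query).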