
Question 17: You are working with a Retrieval-Augmented Generation (RAG) application that uses a large language model (LLM) to generate responses. The cost of running this application is increasing due to heavy use of the LLM for inference. What is the most effective way to use Databricks features to control costs without compromising the quality of responses?
Explanation:
Option B is correct because it targets the actual cost driver: the volume of LLM inference calls. By caching responses to repeated queries, the application avoids paying for the same generation more than once, and by managing prompts it keeps each remaining call as efficient as possible. Together, caching and prompt management deliver meaningful cost reductions while leaving the quality of individual responses unchanged.
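To make the caching idea concrete, here is a minimal sketch of a prompt-level response cache. The in-memory dict and the call_llm function are hypothetical stand-ins, not a Databricks API: a production system would back the cache with a shared store and wire call_llm to the application's actual model-serving endpoint.

```python
import hashlib

# In-memory cache mapping a prompt fingerprint to a previously generated
# response. A hypothetical stand-in; production would use a shared store.
_response_cache: dict[str, str] = {}


def _fingerprint(prompt: str) -> str:
    # Normalize whitespace and case so trivially different phrasings of
    # the same question map to the same cache entry.
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def call_llm(prompt: str) -> str:
    # Placeholder for the application's real LLM inference call
    # (e.g., a model-serving endpoint). Every call here incurs cost.
    raise NotImplementedError("wire this to your model-serving endpoint")


def generate_response(prompt: str) -> str:
    """Return a cached response when available; otherwise make one paid
    LLM call and cache the result for subsequent identical queries."""
    key = _fingerprint(prompt)
    if key in _response_cache:
        return _response_cache[key]   # cache hit: zero inference cost
    response = call_llm(prompt)       # cache miss: one paid LLM call
    _response_cache[key] = response
    return response
```

The design point is that repeated or near-identical user queries, which are common in RAG applications, are served from the cache at no inference cost, so total spend grows with the number of distinct queries rather than the raw request count.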