
Answer-first summary for fast verification
Answer: Purchase Provisioned Throughput for the model on Amazon Bedrock.
## Detailed Explanation

This question involves deploying a custom fine-tuned model from Amazon Bedrock to production with a **steady, consistent rate of requests per minute**. The key requirements are:

1. **Production deployment** of a custom fine-tuned model
2. **Steady workload** (not bursty or unpredictable)
3. **Most cost-effective** solution

### Analysis of Each Option:

**A: Deploy the model by using an Amazon EC2 compute optimized instance.**

- This means hosting the model yourself on EC2 instances.
- While technically possible, it carries significant operational overhead for provisioning, scaling, monitoring, and maintenance.
- For a steady workload, you would pay for EC2 capacity continuously and still have to operate the serving stack, which compares poorly with a managed service.
- **Not the most cost-effective**, because the instance cost comes on top of the effort of managing the infrastructure.

**B: Use the model with on-demand throughput on Amazon Bedrock.**

- On-demand throughput charges per token processed, making it suitable for **unpredictable or low-volume workloads**.
- However, for **custom fine-tuned models on Amazon Bedrock, on-demand mode is typically not available**; inference on a custom model generally requires Provisioned Throughput.
- Even if it were available, pay-per-token pricing would be less cost-effective than reserved capacity for a **steady, predictable workload**.
- **Not optimal**, due to likely unavailability for custom models and higher variable costs at consistent usage.

**C: Store the model in Amazon S3 and host the model by using AWS Lambda.**

- This approach serves the model from AWS Lambda, loading the model artifacts from S3.
- While Lambda can handle some inference workloads, it has limitations for large language models:
  - **Cold start latency** can be significant for LLMs.
  - **Memory and timeout constraints** apply (15-minute maximum execution time, up to 10 GB of memory).
  - It is not designed for continuous, steady inference workloads with large models.
- **Not suitable** for production deployment of custom LLMs with steady request rates, due to performance and scalability limitations.

**D: Purchase Provisioned Throughput for the model on Amazon Bedrock.**

- **Provisioned Throughput reserves dedicated model capacity** (model units) for a custom fine-tuned model.
- It provides **guaranteed, predictable throughput**, which is ideal for steady workloads.
- **Most cost-effective** for consistent usage because:
  - Commitment terms offer **discounted pricing**, and a fixed rate for reserved capacity compares favorably with pay-per-token pricing when the capacity is used consistently.
  - It eliminates the operational overhead of managing infrastructure.
- It is specifically designed for **production deployment of custom models** on Amazon Bedrock.
- It matches all three requirements: production deployment, steady request rate, and cost-effectiveness.

### Conclusion:

Option **D** is the optimal choice because it addresses every requirement: it enables production deployment of custom fine-tuned models on Amazon Bedrock, provides predictable capacity for steady workloads, and offers the most cost-effective pricing through reserved throughput. The other options either lack support for custom models, incur higher costs for consistent usage, or introduce operational complexity that reduces cost-effectiveness.
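The cost reasoning for a steady workload can be made concrete with a small break-even calculation. The sketch below is illustrative only: every price and token count in it is a hypothetical placeholder, not an actual AWS price, and the helper names (`Workload`, `monthly_pay_per_token_cost`, `monthly_reserved_cost`, `break_even_requests_per_minute`) are invented for this example. It assumes a flat hourly rate per reserved model unit and a simple per-1,000-token rate for pay-per-token usage.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12 months


@dataclass(frozen=True)
class Workload:
    """A steady request stream (all values are assumptions for illustration)."""
    requests_per_minute: float
    input_tokens_per_request: int
    output_tokens_per_request: int


def cost_per_request(w: Workload, price_per_1k_input: float,
                     price_per_1k_output: float) -> float:
    """Pay-per-token cost of a single request."""
    return (w.input_tokens_per_request / 1000 * price_per_1k_input
            + w.output_tokens_per_request / 1000 * price_per_1k_output)


def monthly_pay_per_token_cost(w: Workload, price_per_1k_input: float,
                               price_per_1k_output: float) -> float:
    """Monthly cost if every request is billed per token."""
    requests_per_month = w.requests_per_minute * 60 * HOURS_PER_MONTH
    return requests_per_month * cost_per_request(
        w, price_per_1k_input, price_per_1k_output)


def monthly_reserved_cost(model_units: int, hourly_price_per_unit: float) -> float:
    """Monthly cost of reserved capacity, billed hourly regardless of usage."""
    return model_units * hourly_price_per_unit * HOURS_PER_MONTH


def break_even_requests_per_minute(w: Workload, price_per_1k_input: float,
                                   price_per_1k_output: float, model_units: int,
                                   hourly_price_per_unit: float) -> float:
    """Request rate above which reserved capacity is cheaper than per-token billing."""
    per_request = cost_per_request(w, price_per_1k_input, price_per_1k_output)
    return model_units * hourly_price_per_unit / (60 * per_request)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the shape of the comparison.
    steady = Workload(requests_per_minute=100,
                      input_tokens_per_request=1000,
                      output_tokens_per_request=500)
    per_token = monthly_pay_per_token_cost(steady, 0.001, 0.002)
    reserved = monthly_reserved_cost(model_units=1, hourly_price_per_unit=10.0)
    threshold = break_even_requests_per_minute(steady, 0.001, 0.002, 1, 10.0)
    print(f"pay-per-token: {per_token:.2f}, reserved: {reserved:.2f}, "
          f"break-even at {threshold:.1f} requests/minute")
```

The point of the sketch is the shape of the comparison, not the numbers: reserved capacity is a fixed monthly amount, while per-token billing grows linearly with the request rate, so a workload that stays above the break-even rate favors reserved capacity. The sketch ignores the throughput ceiling of a single model unit; a real sizing exercise would also check how many units the request rate actually needs.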
Author: LeetQuiz Editorial Team
A company has fine-tuned a custom model using an existing large language model (LLM) from Amazon Bedrock. They need to deploy this model to production to serve a consistent, steady rate of requests per minute.
What is the most cost-effective solution to meet these requirements?
A. Deploy the model by using an Amazon EC2 compute optimized instance.
B. Use the model with on-demand throughput on Amazon Bedrock.
C. Store the model in Amazon S3 and host the model by using AWS Lambda.
D. Purchase Provisioned Throughput for the model on Amazon Bedrock.