
Answer-first summary for fast verification
Answer: Deploy the model using provisioned throughput as it comes with performance guarantees
The question asks how to ensure a pay-per-token Foundation Model API endpoint can handle high request volumes in production. Option D (Deploy the model using provisioned throughput as it comes with performance guarantees) is correct: provisioned throughput reserves serving capacity and comes with performance guarantees, which is exactly what high-volume production traffic requires, whereas pay-per-token endpoints are rate-limited and intended for experimentation and low-throughput workloads. Option A (Switch to using External Models instead) does not help, because External Models simply route requests to third-party providers and carry no throughput guarantees of their own. Option B (Throttle the incoming batch of requests manually) works around rate limits instead of raising capacity, and manual throttling does not scale with traffic. Option C (Change to a model with fewer parameters) addresses hardware and latency constraints, not request-volume scalability. The community discussion supports D, with 67% of answers favoring it and the highest-upvoted comment endorsing it.
Author: LeetQuiz Editorial Team
A Generative AI Engineer has built an LLM application using a pay-per-token Foundation Model API. As they prepare for production deployment, how can they ensure the model endpoint can handle high volumes of incoming requests?
A
Switch to using External Models instead
B
Throttle the incoming batch of requests manually to avoid rate limiting issues
C
Change to a model with fewer parameters in order to reduce hardware constraint issues
D
Deploy the model using provisioned throughput as it comes with performance guarantees
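For readers who want to see what option D looks like in practice, below is a minimal sketch of creating a provisioned-throughput serving endpoint through the Databricks Serving Endpoints REST API (`POST /api/2.0/serving-endpoints`). The endpoint name, model identifier, and tokens-per-second values are illustrative assumptions, not values from the question; check your workspace's documentation for the throughput bands available for your model.

```python
import json

# Sketch, assuming the Databricks Serving Endpoints REST API.
# Provisioned throughput reserves capacity in tokens/second bands,
# which is what gives the endpoint its performance guarantees
# (unlike pay-per-token endpoints, which are rate-limited).

def build_endpoint_config(name, entity_name, entity_version,
                          min_tokens_per_sec, max_tokens_per_sec):
    """Build the JSON payload for a provisioned-throughput endpoint."""
    return {
        "name": name,
        "config": {
            "served_entities": [
                {
                    "entity_name": entity_name,
                    "entity_version": entity_version,
                    # Reserved capacity band, in tokens per second.
                    "min_provisioned_throughput": min_tokens_per_sec,
                    "max_provisioned_throughput": max_tokens_per_sec,
                }
            ]
        },
    }

payload = build_endpoint_config(
    name="prod-llm-endpoint",          # hypothetical endpoint name
    entity_name="my_catalog.models.my_llm",  # hypothetical registered model
    entity_version="1",
    min_tokens_per_sec=970,            # hypothetical band values
    max_tokens_per_sec=1940,
)

print(json.dumps(payload, indent=2))

# To actually create the endpoint, POST this payload to your workspace,
# e.g. (requires a workspace URL and a personal access token):
#
#   import os, requests
#   requests.post(
#       f"{os.environ['DATABRICKS_HOST']}/api/2.0/serving-endpoints",
#       headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
#       json=payload,
#   )
```

The key difference from the pay-per-token setup in the question is the `min_provisioned_throughput` / `max_provisioned_throughput` pair: capacity is reserved up front and billed for, so the endpoint can absorb high request volumes without hitting the shared rate limits that pay-per-token endpoints are subject to.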