
Answer-first summary for fast verification
Answer: Deploy the model on a Vertex AI endpoint manually by creating a custom inference container.
The question requires full control over infrastructure and minimal inference time. Option A (deploying on a Vertex AI endpoint with a custom inference container) provides the best balance: it gives full control over the container environment (dependencies, runtime, and hardware such as GPUs) while leveraging Vertex AI's managed, optimized serving infrastructure for low latency. This is supported by the community discussion, where option A holds 75% of the vote and the top-voted comment highlights its benefits for control and latency. Option D (a manual GKE deployment with a custom YAML manifest) also offers control, but it lacks Vertex AI's managed serving optimizations and adds operational overhead without guaranteed latency improvements. Options B and C rely on Model Garden deployments, which provide less control over the infrastructure and may apply default configurations that are not tuned for low latency, as noted in the discussion.
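A rough sketch of what option A looks like in practice, using the `gcloud ai` CLI. The project ID, Artifact Registry path, image name, routes, port, endpoint/model IDs, and machine/accelerator types below are all placeholders chosen for illustration, not values from the question:

```shell
# Hypothetical sketch of a custom-container Vertex AI deployment.
# All IDs, paths, and hardware choices are placeholder assumptions.

# 1. Build and push the custom serving image for Gemma.
docker build -t us-central1-docker.pkg.dev/PROJECT_ID/repo/gemma-serve:latest .
docker push us-central1-docker.pkg.dev/PROJECT_ID/repo/gemma-serve:latest

# 2. Upload the model to the Vertex AI Model Registry with the custom container.
gcloud ai models upload \
  --region=us-central1 \
  --display-name=gemma-custom \
  --container-image-uri=us-central1-docker.pkg.dev/PROJECT_ID/repo/gemma-serve:latest \
  --container-ports=8080 \
  --container-predict-route=/predict \
  --container-health-route=/health

# 3. Create an endpoint and deploy the model with GPU acceleration for low latency.
gcloud ai endpoints create --region=us-central1 --display-name=gemma-endpoint
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=MODEL_ID \
  --display-name=gemma-deployment \
  --machine-type=g2-standard-12 \
  --accelerator=type=nvidia-l4,count=1
```

The custom container is where the "full control" requirement is satisfied: you choose the serving framework, dependencies, and routes, while Vertex AI handles autoscaling and endpoint management.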
Author: LeetQuiz Editorial Team
You are an ML researcher at an investment bank experimenting with the Gemma large language model (LLM) for an internal use case. You require full control over the model's underlying infrastructure and need to minimize the model's inference time. Which serving configuration should you use?
A
Deploy the model on a Vertex AI endpoint manually by creating a custom inference container.
B
Deploy the model on a Google Kubernetes Engine (GKE) cluster by using the deployment options in Model Garden.
C
Deploy the model on a Vertex AI endpoint by using one-click deployment in Model Garden.
D
Deploy the model on a Google Kubernetes Engine (GKE) cluster manually by creating a custom YAML manifest.