You have recently deployed a machine learning model to a Vertex AI endpoint and configured online serving using Vertex AI Feature Store. Additionally, you have set up a daily batch ingestion job to update your Feature Store with new data. However, you notice that during the batch ingestion jobs, the CPU utilization is high on your Feature Store's online serving nodes, leading to increased feature retrieval latency. How can you improve the online serving performance and reduce latency during these batch ingestion jobs?
Explanation:
The correct answer is B: Enable autoscaling of the online serving nodes in your featurestore. With autoscaling, the number of online serving nodes adjusts automatically to the load, so the extra traffic generated by the daily batch ingestion job is absorbed by additional nodes rather than saturating a fixed pool. This directly addresses the high CPU utilization and keeps feature retrieval latency low during ingestion. It is also more flexible than manually scheduling a node increase before each ingestion window, since a manual schedule must be tuned by hand and does not adapt if ingestion volume or serving traffic changes.
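As a sketch, autoscaling for a Vertex AI Feature Store's online serving nodes is configured through the featurestore's `onlineServingConfig.scaling` block, which sets a minimum and maximum node count. The snippet below builds that configuration as a plain dict mirroring the v1 REST field names; the commented-out client call indicates roughly where it would be applied (it assumes the `google-cloud-aiplatform` package, valid credentials, and placeholder resource names, and is not executed here).

```python
# Sketch: autoscaling configuration for a Vertex AI Feature Store's
# online serving nodes, mirroring the v1 REST field names under
# onlineServingConfig.scaling. The node counts are illustrative.
scaling_config = {
    "onlineServingConfig": {
        "scaling": {
            "minNodeCount": 1,  # baseline capacity for normal traffic
            "maxNodeCount": 5,  # headroom for batch-ingestion spikes
        }
    }
}

# With the google-cloud-aiplatform client installed and credentials
# configured, the config would be applied along these lines
# (PROJECT/STORE are placeholders; call not executed here):
#
# from google.cloud import aiplatform_v1
# client = aiplatform_v1.FeaturestoreServiceClient(
#     client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
# )
# featurestore = aiplatform_v1.Featurestore(
#     name="projects/PROJECT/locations/us-central1/featurestores/STORE",
#     online_serving_config=aiplatform_v1.Featurestore.OnlineServingConfig(
#         scaling=aiplatform_v1.Featurestore.OnlineServingConfig.Scaling(
#             min_node_count=1, max_node_count=5
#         )
#     ),
# )
# client.update_featurestore(featurestore=featurestore)

print(scaling_config["onlineServingConfig"]["scaling"])
```

Setting a sensible `minNodeCount` preserves baseline capacity for steady-state serving, while `maxNodeCount` caps cost during the ingestion-driven spikes.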