Google Professional Cloud DevOps Engineer


Explanation:

Under Google-recommended practices, the goal when isolating an unhealthy node in a load balancer pool is to minimize user impact and keep the service available. Option A is correct because it communicates intent to the incident team first, checks whether the remaining nodes can absorb the traffic shifted off the unhealthy node (scaling the pool if they cannot), and waits for any new nodes to report healthy before draining and removing the unhealthy node. This confirms that the service can carry the load before any capacity is taken away. Option B is also correct: it likewise communicates first, adds a new node to the pool, and waits for that node to report healthy and serve traffic before draining the unhealthy node. Both options put communication, capacity, and node health ahead of any change to the load balancer pool.
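The load analysis in option A can be made concrete with a little arithmetic. The sketch below is only an illustration. It assumes that "70% capacity" means average per-node utilization, that traffic is spread evenly across nodes, and that the team wants to stay under a hypothetical 80% utilization target. The function names and the target value are assumptions, not part of the question.

```python
import math


def utilization_after_removal(node_count: int, current_utilization: float,
                              removed: int = 1) -> float:
    """Estimate per-node utilization once `removed` nodes leave the pool.

    Assumes traffic is balanced evenly and that the removed node's share
    is redistributed across the remaining nodes.
    """
    remaining = node_count - removed
    if remaining <= 0:
        raise ValueError("at least one node must remain in the pool")
    if not 0.0 <= current_utilization <= 1.0:
        raise ValueError("current_utilization must be between 0 and 1")
    return current_utilization * node_count / remaining


def nodes_to_add(node_count: int, current_utilization: float,
                 target_utilization: float = 0.8, removed: int = 1) -> int:
    """Return how many new nodes are needed before removing `removed` nodes.

    `target_utilization` is a hypothetical headroom threshold chosen for
    this example; a real value would come from the service's own SLOs.
    """
    total_load = current_utilization * node_count  # load in "node units"
    # Round before ceil to avoid over-provisioning due to float noise.
    required = math.ceil(round(total_load / target_utilization, 9))
    return max(0, required - (node_count - removed))


if __name__ == "__main__":
    for pool_size in (3, 4, 10):
        after = utilization_after_removal(pool_size, 0.70)
        extra = nodes_to_add(pool_size, 0.70)
        print(f"{pool_size} nodes: {after:.0%} after removal, add {extra} first")
```

Under these assumptions, a small pool cannot lose a node safely at 70% utilization (three nodes would be pushed past 100%), while a larger pool may absorb the shift without scaling. That is why the analysis comes before the drain.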


As the Operations Lead handling an incident with your service, you observe that a single node is returning 5xx errors for all requests while the service typically operates at 70% capacity. Customer support cases have also increased. To minimize user impact and adhere to Google-recommended practices, how should you proceed to remove the faulty node from the load balancer pool for isolation and investigation?


A. 1. Communicate your intent to the incident team. 2. Perform a load analysis to determine if the remaining nodes can handle the increase in traffic offloaded from the removed node, and scale appropriately. 3. When any new nodes report healthy, drain traffic from the unhealthy node, and remove the unhealthy node from service.

43.8%

B. 1. Communicate your intent to the incident team. 2. Add a new node to the pool, and wait for the new node to report as healthy. 3. When traffic is being served on the new node, drain traffic from the unhealthy node, and remove the old node from service.

37.5%

C. 1. Drain traffic from the unhealthy node and remove the node from service. 2. Monitor traffic to ensure that the error is resolved and that the other nodes in the pool are handling the traffic appropriately. 3. Scale the pool as necessary to handle the new load. 4. Communicate your actions to the incident team.

12.5%

D. 1. Drain traffic from the unhealthy node and remove the old node from service. 2. Add a new node to the pool, wait for the new node to report as healthy, and then serve traffic to the new node. 3. Monitor traffic to ensure that the pool is healthy and is handling traffic appropriately. 4. Communicate your actions to the incident team.

6.3%