
Answer-first summary for fast verification
Answer: Redirect traffic away from the affected region to the other two regions while addressing the issue.
Options A and B are not ideal because neither guarantees a resolution: additional or restarted instances would likely hit the same patching error, and both may prolong the Mean Time to Repair (MTTR). Option C may seem like a quick fix, but a rollback takes time to complete, so requests keep failing until it finishes. Option D is the correct approach because it minimizes user impact right away by redirecting traffic to the two healthy regions, leaving time to properly address the patching error in the affected region. This strategy follows the SRE practice of mitigating user impact first and diagnosing the root cause afterward. For more information, refer to the Google Cloud documentation on enabling connection draining and the SRE book's guidance on effective troubleshooting.
Author: LeetQuiz Editorial Team
Your Site Reliability Engineering (SRE) team is managing an application deployed across three regions, utilizing Managed Instance Groups behind a global HTTP(S) Load Balancer. While applying a critical security patch to Compute Engine instances, the first two regions were successfully updated, but an error in the third region causes requests to fail. To minimize user impact from this unsuccessful patch, what is the best course of action?
A. Increase the number of instances in the affected region to handle the load.
B. Attempt to restart all instances in the region where the patch failed.
C. Immediately revert the changes made in the third region to its previous state.
D. Redirect traffic away from the affected region to the other two regions while addressing the issue.
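
The effect of option D can be illustrated with a small toy model. This is only a sketch under stated assumptions: the region names and the `route_shares` helper are hypothetical, and a real global load balancer also considers client proximity and backend health rather than splitting traffic purely by weight. On Google Cloud, draining a region is commonly done by lowering that backend's capacity setting to zero, but check the current documentation before relying on that.

```python
def route_shares(capacity):
    """Split traffic across regions in proportion to each region's capacity weight."""
    total = sum(capacity.values())
    if total == 0:
        raise ValueError("no region has capacity to serve traffic")
    return {region: weight / total for region, weight in capacity.items()}


# Hypothetical region names; all three regions start with equal weight.
regions = {"region-a": 1.0, "region-b": 1.0, "region-c": 1.0}

# Drain the region where the patch failed by setting its weight to zero.
drained = {**regions, "region-c": 0.0}

shares = route_shares(drained)
print(shares)  # {'region-a': 0.5, 'region-b': 0.5, 'region-c': 0.0}
```

In this model the failing region stops receiving requests as soon as its weight drops to zero, while the two patched regions absorb its share. That is why redirection mitigates user impact faster than waiting for a rollback or restart to complete.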