
Answer-first summary for fast verification
Answer: Redirect traffic away from the affected region to the other two regions while addressing the patching error.
Option C is the correct choice: diverting traffic to the two healthy regions removes the user-facing failures right away and gives the team time to diagnose and fix the patching error in the third region. Options A and B are not optimal because they attempt a fix without knowing the cause, so they may not resolve the failures and can increase the Mean Time to Repair (MTTR). Option D may seem viable, but users keep seeing failed requests until the rollback completes, so it does not mitigate impact as quickly as redirecting traffic. Mitigating first and repairing afterwards aligns with best practices for troubleshooting and maintaining system reliability under adverse conditions.
References: [Google SRE Book on Effective Troubleshooting](https://sre.google/sre-book/effective-troubleshooting/), [Google Cloud Documentation on Connection Draining](https://cloud.google.com/load-balancing/docs/enabling-connection-draining).
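
To make the effect of redirecting traffic concrete, the following is a minimal sketch in Python. It is a toy model, not the Google Cloud API: the `Backend` class, the `traffic_shares` and `drain` functions, and the region names are all hypothetical, and it assumes traffic is split in proportion to a per-region capacity weight (similar in spirit to a backend's capacity scaler on a load balancer).

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for one regional backend behind the load balancer."""
    region: str
    capacity_scaler: float  # assumed weight: 1.0 = full share, 0.0 = drained


def traffic_shares(backends):
    """Return each region's fraction of traffic, proportional to its weight."""
    total = sum(b.capacity_scaler for b in backends)
    if total == 0:
        raise ValueError("no backend is able to serve traffic")
    return {b.region: b.capacity_scaler / total for b in backends}


def drain(backends, region):
    """Return a copy of the backends with the given region's weight set to 0.0."""
    return [
        Backend(b.region, 0.0 if b.region == region else b.capacity_scaler)
        for b in backends
    ]


# Three regions share traffic equally until the third one is drained.
backends = [
    Backend("region-1", 1.0),
    Backend("region-2", 1.0),
    Backend("region-3", 1.0),
]
after = traffic_shares(drain(backends, "region-3"))
print(after)  # {'region-1': 0.5, 'region-2': 0.5, 'region-3': 0.0}
```

In this model the affected region receives no new requests once drained, while the other two regions absorb its share. In practice that only helps if the remaining regions have enough capacity for the extra load.
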
Author: LeetQuiz Editorial Team
Your Site Reliability Engineering (SRE) team is managing an application deployed across three regions, utilizing Managed Instance Groups behind a global HTTP(S) Load Balancer. During the application of a critical security patch to Compute Engine instances, an error in the third region causes requests to fail. How can you minimize the impact on users from this unsuccessful patching?
A. Increase the number of instances in the affected region to handle the load.
B. Attempt to restart all instances in the affected region to resolve the issue.
C. Redirect traffic away from the affected region to the other two regions while addressing the patching error.
D. Reverse the changes made in the third region to revert to the previous state.