Google Professional Cloud DevOps Engineer


You are a member of an on-call Site Reliability Engineering team supporting a web application in production. After a recent update, users in one region report errors and failed requests. You have declared an incident and assessed the impact. Which action should you prioritize next?




Explanation:

The correct action is to mitigate the impact on users first. Mitigation minimizes user disruption and keeps the service available while investigation and a permanent fix proceed. Options A, B, and D are steps that follow once the service is stable. Reference: Google SRE Workbook on Incident Response (Case Study 2).