
Answer-first summary for fast verification
Answer: D. Investigate the issue, and if it continues, assign an incident commander.
Option A is not the first step because the exact nature of the problem is not yet known. Option B is incorrect because it focuses narrowly on the technical fix and ignores the broader incident management process. Option C is premature: root cause analysis is important, but it comes after the application is back online. Option D is correct because it matches Google's recommended approach: investigate the issue first, and if it persists, appoint an incident commander to oversee the resolution. Reference: https://sre.google/sre-book/managing-incidents/
Author: LeetQuiz Editorial Team
While on-call for managing a production application, you receive alerts indicating the application is failing uptime checks. According to SRE best practices for incident management, what is the first action you should take?
A. Inform your team lead immediately.
B. Jump straight into fixing the issue.
C. Conduct a root cause analysis to understand the problem.
D. Investigate the issue, and if it continues, assign an incident commander.