
Answer-first summary for fast verification
Answer: Create a Managed Instance Group with a single instance and use health checks to determine the system status.
To reduce manual operations in line with Site Reliability Engineering principles, automate the recovery of the crashing instance. Option B does this: a Managed Instance Group (MIG) with a single instance and an autohealing health check automatically recreates the instance from its instance template when the health check fails, so nobody has to delete and relaunch it by hand. Option A is worth doing to find the root cause, but it does not reduce the manual work in the meantime. Option C adds a Load Balancer whose health checks only control where traffic is routed; with a single unmanaged instance behind it, nothing recreates the instance after a crash. Option D improves alerting, but a person still has to recreate the instance. Therefore, the correct answer is B.
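The autohealing behavior described above can be illustrated with a small simulation. This is only a toy sketch, not the Compute Engine API: the class `ToyInstanceGroup`, the `probe` callback, and the `unhealthy_threshold` default of 3 are hypothetical names and values chosen for illustration. It shows the idea that a group recreates its single instance after several consecutive failed health checks, with no human in the loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyInstanceGroup:
    """Toy model of a single-instance group with autohealing (not the real GCE API)."""

    probe: Callable[[int], bool]   # hypothetical health probe: instance id -> healthy?
    unhealthy_threshold: int = 3   # consecutive failed probes before recreation
    instance_id: int = 1
    failures: int = 0
    events: List[str] = field(default_factory=list)

    def tick(self) -> None:
        """Run one health-check cycle and recreate the instance if needed."""
        if self.probe(self.instance_id):
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.unhealthy_threshold:
            old = self.instance_id
            # Stands in for "delete the VM and recreate it from the template".
            self.instance_id += 1
            self.failures = 0
            self.events.append(f"recreated instance {old} as {self.instance_id}")


def demo() -> List[str]:
    # Assumption for the demo: instance 1 has crashed, every replacement is healthy.
    group = ToyInstanceGroup(probe=lambda instance_id: instance_id != 1)
    for _ in range(5):
        group.tick()
    return group.events


if __name__ == "__main__":
    print(demo())
```

In this sketch the crashed instance is replaced on the third failed check and the remaining cycles pass, which mirrors why Option B removes the manual delete-and-relaunch step while Options C and D do not.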
Author: LeetQuiz Editorial Team
You manage a production service running on a single Compute Engine instance. Currently, you frequently spend time manually recreating the service by deleting the crashed instance and launching a new one from the appropriate image. To minimize manual intervention while adhering to Site Reliability Engineering (SRE) principles, what should you do?
A
File a bug with the development team so they can find the root cause of the crashing instance.
B
Create a Managed Instance Group with a single instance and use health checks to determine the system status.
C
Add a Load Balancer in front of the Compute Engine instance and use health checks to determine the system status.
D
Create a Stackdriver Monitoring dashboard with SMS alerts so that you can start recreating the crashed instance promptly after it crashes.