Google Professional Cloud DevOps Engineer

You manage a production service that runs on a single Compute Engine instance. You frequently spend time manually recreating the service by deleting the crashed instance and creating a new one from the appropriate image. You want to minimize manual intervention while adhering to Site Reliability Engineering (SRE) principles. What should you do?

File a bug with the development team so they can find the root cause of the crashing instance.

Create a managed instance group (MIG) with a single instance and use health checks to determine the system status.

Create a Stackdriver Monitoring dashboard with SMS alerts so that you can start recreating the crashed instance promptly after it crashes.
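
For readers unfamiliar with the managed instance group option, the following is a minimal sketch of what a single-instance group with health-check-based autohealing might look like. It only assembles `gcloud` command lines and does not run them. The resource names, zone, image family, image project, port, request path, and all timing values are hypothetical placeholders, not values from the question.

```python
import shlex


def autohealing_mig_commands(name, zone, image_family, image_project,
                             port=80, request_path="/healthz",
                             initial_delay_s=300):
    """Build gcloud commands for a one-instance MIG with autohealing.

    All names and values here are illustrative assumptions.
    """
    health_check = f"{name}-hc"
    template = f"{name}-template"
    return [
        # 1. HTTP health check that the group uses to judge instance health.
        ["gcloud", "compute", "health-checks", "create", "http", health_check,
         "--port", str(port), "--request-path", request_path,
         "--check-interval", "10s", "--timeout", "5s",
         "--unhealthy-threshold", "3", "--healthy-threshold", "2"],
        # 2. Instance template that pins the image the service boots from.
        ["gcloud", "compute", "instance-templates", "create", template,
         "--image-family", image_family, "--image-project", image_project],
        # 3. Managed instance group of size 1; an instance that fails the
        #    health check is recreated from the template automatically.
        ["gcloud", "compute", "instance-groups", "managed", "create",
         f"{name}-mig", "--zone", zone, "--template", template,
         "--size", "1", "--health-check", health_check,
         "--initial-delay", str(initial_delay_s)],
    ]


if __name__ == "__main__":
    # Hypothetical values, printed for review rather than executed.
    for cmd in autohealing_mig_commands("web", "us-central1-a",
                                        "debian-12", "debian-cloud"):
        print(shlex.join(cmd))
```

In practice a firewall rule that admits Google's health-check probe ranges is usually also needed, and the initial delay should be long enough for the service to finish starting before the first probe counts against it.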