You manage a production service running on a single Compute Engine instance. Currently, you frequently spend time manually recreating the service by deleting the crashed instance and launching a new one from the appropriate image. To minimize manual intervention while adhering to Site Reliability Engineering (SRE) principles, what should you do?