Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google Compute Engine

Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Incident began at 2023-04-21 02:05 and ended at 2023-04-21 15:13 (all times are US/Pacific).

Previously affected location(s)

Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Osaka (asia-northeast2), Seoul (asia-northeast3), Mumbai (asia-south1), Delhi (asia-south2), Singapore (asia-southeast1), Jakarta (asia-southeast2), Sydney (australia-southeast1), Melbourne (australia-southeast2), Warsaw (europe-central2), Finland (europe-north1), Madrid (europe-southwest1), Belgium (europe-west1), Turin (europe-west12), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Milan (europe-west8), Paris (europe-west9), Tel Aviv (me-west1), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), São Paulo (southamerica-east1), Santiago (southamerica-west1), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Columbus (us-east5), Dallas (us-south1), Oregon (us-west1), Los Angeles (us-west2), Salt Lake City (us-west3), Las Vegas (us-west4)

Date Time Description
24 Apr 2023 12:01 PDT

Mini Incident Report

We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support.

(All Times US/Pacific)

Incident Start: 21 April 2023 02:05

Incident End: 21 April 2023 15:13

Duration: 13 hours, 8 minutes
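
The stated duration can be checked against the start and end times above. The following is a minimal sketch rather than part of the original report; the function name `incident_duration` is hypothetical. Both timestamps are US/Pacific and fall on the same day, so naive datetimes are assumed to be sufficient.

```python
from datetime import datetime

# Timestamp layout matching "21 April 2023 02:05" as written in the report.
TIME_FORMAT = "%d %B %Y %H:%M"


def incident_duration(start_text, end_text):
    """Return (hours, minutes) between two same-timezone timestamps."""
    start = datetime.strptime(start_text, TIME_FORMAT)
    end = datetime.strptime(end_text, TIME_FORMAT)
    total_minutes = int((end - start).total_seconds()) // 60
    return divmod(total_minutes, 60)


hours, minutes = incident_duration("21 April 2023 02:05", "21 April 2023 15:13")
print(f"Duration: {hours} hours, {minutes} minutes")
```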

Affected Services and Features:

Google Compute Engine Control Plane

Regions/Zones: Global

Description:

For 13 hours and 8 minutes, Google Compute Engine Control Plane health checks failed for any changes made to newly added health checks. Preliminary analysis indicates that a recent network configuration change caused the issue.

Customer Impact:

During the incident the following GCE actions failed:

  • Any activity, including Instance Group operations, that reuses an existing health check and directs it to a new or different Virtual Machine instance.
  • Altering an existing health check.
  • Calling gcloud compute instance-groups managed wait-until --stable on a newly created instance group with an existing health check. The call would fail or time out because the Instance Group Manager would likely not reach a stable state.
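
Because the wait-until call mentioned above could block until it timed out, a caller may want to bound the wait explicitly. The sketch below is illustrative only and is not taken from the report: the helper names, the group name, the zone, and the timeout values are assumptions. It relies on the `--stable`, `--zone`, and `--timeout` flags of `gcloud compute instance-groups managed wait-until`.

```python
import subprocess


def build_wait_command(group, zone, timeout_s):
    """Build the gcloud argument list for a bounded wait on a zonal group."""
    return [
        "gcloud", "compute", "instance-groups", "managed", "wait-until",
        group, "--stable", f"--zone={zone}", f"--timeout={timeout_s}",
    ]


def wait_until_stable(group, zone, timeout_s=600):
    """Return True if the group became stable, False on failure or timeout."""
    cmd = build_wait_command(group, zone, timeout_s)
    try:
        # Allow a small margin beyond gcloud's own timeout before giving up.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s + 30
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # gcloud hung past the margin, or is not installed on this machine.
        return False
    return result.returncode == 0
```

A call such as `wait_until_stable("my-mig", "us-central1-a", timeout_s=300)` (hypothetical names) would return False instead of hanging if the group never stabilizes.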
21 Apr 2023 15:18 PDT

The issue with Google Compute Engine has been resolved for all affected projects as of Friday, 2023-04-21 15:13 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

21 Apr 2023 12:29 PDT

Summary: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Description: Mitigation work is taking longer than expected and is still underway by our engineering team.

The mitigation is expected to complete by Friday, 2023-04-21 15:00 US/Pacific.

We will provide more information by Friday, 2023-04-21 15:30 US/Pacific.

Diagnosis: Customers impacted: any customer that uses Managed Instance Groups. Symptoms: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state. The issue should not impact running VMs. It should occur only if a VM reaches an unhealthy state, in which case the automation that would usually repair it may not work.

Workaround: None at this time.

21 Apr 2023 07:34 PDT

Summary: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Description: Mitigation work is taking longer than expected and is still underway by our engineering team.

The mitigation is expected to complete by Friday, 2023-04-21 12:30 US/Pacific.

We will provide more information by Friday, 2023-04-21 13:00 US/Pacific.

Diagnosis: Customers impacted: any customer that uses Managed Instance Groups. Symptoms: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state. The issue should not impact running VMs. It should occur only if a VM reaches an unhealthy state, in which case the automation that would usually repair it may not work.

Workaround: None at this time.

21 Apr 2023 06:03 PDT

Summary: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Description: Mitigation work is still underway by our engineering team.

We will provide more information by Friday, 2023-04-21 08:15 US/Pacific.

Diagnosis: Customers impacted: any customer that uses Managed Instance Groups. Symptoms: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state. The issue should not impact running VMs. It should occur only if a VM reaches an unhealthy state, in which case the automation that would usually repair it may not work.

Workaround: None at this time.

21 Apr 2023 05:05 PDT

Summary: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Description: Mitigation work is currently underway by our engineering team.

We do not have an ETA for mitigation at this point.

We will provide more information by Friday, 2023-04-21 06:15 US/Pacific.

Diagnosis: Customers impacted: any customer that uses Managed Instance Groups. Symptoms: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state. The issue should not impact running VMs. It should occur only if a VM reaches an unhealthy state, in which case the automation that would usually repair it may not work.

Workaround: None at this time.

21 Apr 2023 04:06 PDT

Summary: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Description: We are experiencing an issue with Google Compute Engine beginning at Friday, 2023-04-21 02:00 US/Pacific.

Our engineering team is still investigating the issue and working on a mitigation plan.

We will provide an update by Friday, 2023-04-21 05:15 US/Pacific with current details.

Diagnosis: Customers impacted: any customer that uses Managed Instance Groups in europe-west12-a, europe-west12-b, or europe-west12-c. Symptoms: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state. The issue should not impact running VMs. It should occur only if a VM reaches an unhealthy state, in which case the automation that would usually repair it may not work.

Workaround: None at this time.

21 Apr 2023 03:05 PDT

Summary: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state

Description: We are experiencing an issue with Google Compute Engine beginning at Friday, 2023-04-21 02:00 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Friday, 2023-04-21 04:15 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: Customers impacted: any customer that uses Managed Instance Groups and relies on health checks to keep the instances in a healthy state. Symptoms: Managed Instance Groups that rely on health checks to restart unhealthy VMs may remain in an unhealthy state. The issue should not impact running VMs. It should occur only if a VM reaches an unhealthy state, in which case the automation that would usually repair it may not work.

Workaround: None at this time.
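
The symptom described in the updates above, where VMs stay unhealthy because automated repair does not run, can at least be detected by inspecting the per-instance health state of a group. The sketch below is not a workaround and is not part of the original report. It assumes JSON shaped like the output of `gcloud compute instance-groups managed list-instances --format=json`, where each entry carries an `instanceHealth` list with a `detailedHealthState` field. The sample data and the function name `unhealthy_instances` are hypothetical.

```python
import json

# Hypothetical sample resembling list-instances JSON output.
SAMPLE = """
[
  {"instance": "https://www.googleapis.com/compute/v1/projects/p/zones/z/instances/vm-a",
   "instanceHealth": [{"detailedHealthState": "HEALTHY"}]},
  {"instance": "https://www.googleapis.com/compute/v1/projects/p/zones/z/instances/vm-b",
   "instanceHealth": [{"detailedHealthState": "UNHEALTHY"}]}
]
"""


def unhealthy_instances(list_instances_json):
    """Return names of instances with any reported state other than HEALTHY."""
    names = []
    for entry in json.loads(list_instances_json):
        states = [
            health.get("detailedHealthState")
            for health in entry.get("instanceHealth", [])
        ]
        # Instances with no reported health information are not flagged.
        if any(state != "HEALTHY" for state in states):
            names.append(entry.get("instance", "").rsplit("/", 1)[-1])
    return names


print(unhealthy_instances(SAMPLE))
```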