Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google Kubernetes Engine

us-central1-c: GKE experiencing issues with some cluster and nodepool operations. Mitigation Underway.

Incident began at 2021-10-20 02:00 and ended at 2021-10-20 09:30 (all times are US/Pacific).

Date Time Description
21 Oct 2021 14:55 PDT

We apologize for the inconvenience this service outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case using https://cloud.google.com/support

(All Times US/Pacific)

First Impact

Incident Start: 19 October 2021 11:40

Incident End: 19 October 2021 16:21

Duration: 4 hours, 41 minutes

Second Impact

Incident Start: 20 October 2021 02:00

Incident End: 20 October 2021 09:30

Duration: 7 hours, 30 minutes
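The two durations above follow directly from the listed start and end times. As a quick arithmetic check, here is a minimal sketch; the helper name `duration` and the format string are illustrative assumptions, not part of the report, and both timestamps are treated as being in the same time zone (US/Pacific).

```python
from datetime import datetime

# Format matching the timestamps in the report, e.g. "19 October 2021 11:40".
FMT = "%d %B %Y %H:%M"


def duration(start, end):
    """Return (hours, minutes) between two timestamps in the same time zone."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    hours, remainder = divmod(int(delta.total_seconds()), 3600)
    return hours, remainder // 60


first_impact = duration("19 October 2021 11:40", "19 October 2021 16:21")
second_impact = duration("20 October 2021 02:00", "20 October 2021 09:30")
print(first_impact)   # (4, 41)
print(second_impact)  # (7, 30)
```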

Affected Services and Features:

Google Kubernetes Engine

Regions/Zones: us-central1-c

Description:

Google Kubernetes Engine experienced two impacts to operations, the first on 19 October 2021 and the second on 20 October 2021.

First Impact: Customers may have experienced a failure rate of up to 100% for create-cluster, delete-cluster, and delete-nodepool operations and node-pool resizes in us-central1-c for 4 hours and 41 minutes. From preliminary analysis, the root cause was resource contention related to an unexpected increase in API operations. Engineers scaled up instances to mitigate the issue.

Second Impact: Customers may have experienced a failure rate of up to 80% for create-cluster, delete-cluster, and delete-nodepool operations and node-pool resizes in us-central1-c for 7 hours and 30 minutes. From preliminary analysis, the root cause was an unexpected increase in nodepool operations in a single customer cluster. Engineers mitigated the issue through additional quota enforcement.

Customer Impact:

Affected customers would have experienced 500/503 errors for create-cluster, delete-cluster, create-nodepool, delete-nodepool, and node-pool resize operations.

20 Oct 2021 15:38 PDT

The issue with GKE cluster and nodepool operations has been resolved for all affected projects as of Wednesday, 2021-10-20 15:21 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

20 Oct 2021 13:35 PDT

Summary: us-central1-c: GKE experiencing issues with some cluster and nodepool operations.

Description: Our engineering team continues to work on the resolution of the issue.

We do not have an ETA for full resolution at this point.

Customers should see improvement in the completion of cluster and nodepool operations.

We will provide an update by Wednesday, 2021-10-20 15:30 US/Pacific with current details.

Diagnosis: The following operations may fail with 500/503 errors: create-cluster, delete-cluster, create-nodepool, delete-nodepool, and node-pool resizes.

Workaround: Failed operations may succeed when retried.

20 Oct 2021 11:34 PDT

Summary: us-central1-c: GKE experiencing issues with some cluster and nodepool operations.

Description: We believe the issue with Google Kubernetes Engine is partially resolved.

Customers should see improvement in the completion of cluster and nodepool operations.

We do not have an ETA for full resolution at this point.

We will provide an update by Wednesday, 2021-10-20 13:45 US/Pacific with current details.

Diagnosis: The following operations may fail with 500/503 errors: create-cluster, delete-cluster, create-nodepool, delete-nodepool, and node-pool resizes.

Workaround: Failed operations may succeed when retried.

20 Oct 2021 10:13 PDT

Summary: us-central1-c: GKE experiencing issues with some cluster and nodepool operations.

Description: We believe the issue with Google Kubernetes Engine is partially resolved.

Customers should see improvement in the completion of cluster and nodepool operations.

We do not have an ETA for full resolution at this point.

We will provide an update by Wednesday, 2021-10-20 11:45 US/Pacific with current details.

Diagnosis: The following operations may fail with 500/503 errors: create-cluster, delete-cluster, create-nodepool, delete-nodepool, and node-pool resizes.

Workaround: Failed operations may succeed when retried.

20 Oct 2021 09:41 PDT

Summary: us-central1-c: GKE experiencing issues with some cluster and nodepool operations.

Description: We are experiencing an issue with Google Kubernetes Engine beginning at Wednesday, 2021-10-20 08:30 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Wednesday, 2021-10-20 10:15 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: The following operations may fail with 500/503 errors: create-cluster, delete-cluster, create-nodepool, delete-nodepool, and node-pool resizes.

Workaround: Failed operations may succeed when retried.

20 Oct 2021 08:52 PDT

Summary: us-central1-c: GKE experiencing issues with some cluster and nodepool operations.

Description: We are experiencing an issue with Google Kubernetes Engine beginning at Wednesday, 2021-10-20 08:30 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Wednesday, 2021-10-20 09:42 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: The following operations may fail with 500/503 errors: create-cluster, delete-cluster, create-nodepool, delete-nodepool, and node-pool resizes.

Workaround: Failed operations may succeed when retried.
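
The retry workaround named in these updates can be sketched as a small client-side helper. This is a minimal illustration under stated assumptions, not Google's recommended implementation: `OperationError`, `retry_with_backoff`, and `make_flaky_operation` are hypothetical names, the operation is a stand-in rather than a real GKE API call, and exponential backoff with jitter is one common retry policy chosen here for the example. The sketch also does not address idempotency, so a retried create operation may need to handle a resource that already exists.

```python
import random
import time


class OperationError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"operation failed with HTTP {status}")
        self.status = status


# The error codes named in the Diagnosis lines of the incident updates.
RETRYABLE_STATUSES = {500, 503}


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Only 500/503 failures are retried; any other error is raised at once.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OperationError as err:
            if err.status not in RETRYABLE_STATUSES or attempt == max_attempts:
                raise
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


def make_flaky_operation(failures):
    """Return a stand-in operation that fails `failures` times with a 503."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise OperationError(503)
        return "DONE"

    operation.state = state
    return operation


if __name__ == "__main__":
    # Short base delay so the demonstration finishes quickly.
    flaky = make_flaky_operation(failures=2)
    print(retry_with_backoff(flaky, base_delay=0.01))  # DONE
```

In real use, `operation` would wrap whatever client call issues the cluster or nodepool request and translate its failure into an error that exposes the HTTP status. The jitter spreads retries out so that many clients recovering at once do not add to the load that contributed to the failures.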