Service Health


Incident affecting Google Kubernetes Engine

Google Kubernetes Engine customers with Workload Identity enabled may see a high application logging rate

Incident began at 2023-11-02 10:45 and ended at 2023-11-10 13:37 (all times are US/Pacific).

Previously affected location(s)

Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Osaka (asia-northeast2), Seoul (asia-northeast3), Mumbai (asia-south1), Delhi (asia-south2), Singapore (asia-southeast1), Jakarta (asia-southeast2), Sydney (australia-southeast1), Melbourne (australia-southeast2), Warsaw (europe-central2), Finland (europe-north1), Madrid (europe-southwest1), Belgium (europe-west1), Berlin (europe-west10), Turin (europe-west12), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Milan (europe-west8), Paris (europe-west9), Doha (me-central1), Dammam (me-central2), Tel Aviv (me-west1), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), São Paulo (southamerica-east1), Santiago (southamerica-west1), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Columbus (us-east5), Dallas (us-south1), Oregon (us-west1), Los Angeles (us-west2), Salt Lake City (us-west3), Las Vegas (us-west4)

Date Time Description
10 Nov 2023 13:37 PST

The issue with Google Kubernetes Engine has been resolved for all affected users as of Thursday, 2023-11-09 23:00 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

8 Nov 2023 14:06 PST

Summary: Google Kubernetes Engine customers with Workload Identity enabled may see a high application logging rate

Description: There is no impact to cluster functionality or performance. The only impact is extra logs in Cloud Logging.

gke-metadata-server is a GKE-managed system workload that is part of the GKE Workload Identity feature. Versions 0.4.272 to 0.4.280 of gke-metadata-server contain an incorrect configuration that results in a high rate of debug logs that contain the string "Unable to sync sandbox". These logs are then ingested into Cloud Logging, consuming Cloud Logging ingestion quota, and causing excess billable usage when exceeding the free monthly allotment.

The rollout of a fix that stops the excess logs from being ingested into Cloud Logging is about 25% complete.

We will provide an update by Friday, 2023-11-10 14:00 US/Pacific.

Diagnosis: Customers can determine whether their cluster is impacted by inspecting the gke-metadata-server daemonset with kubectl get daemonset -n kube-system -l k8s-app=gke-metadata-server -o yaml and looking at the components.gke.io/component-version annotation in .spec.template.metadata.annotations. If the value is a version between 0.4.272 and 0.4.280 (inclusive), then the cluster is currently affected.
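The version check described above can be sketched in Python. This is a minimal sketch rather than an official tool: the function names `component_versions` and `is_affected` are illustrative, the sketch assumes the kubectl command is run with `-o json` instead of `-o yaml`, and it assumes the annotation value is a plain dotted version such as 0.4.276.

```python
import json

# Annotation key and affected range as given in the incident description.
ANNOTATION = "components.gke.io/component-version"
AFFECTED_MIN = (0, 4, 272)  # inclusive
AFFECTED_MAX = (0, 4, 280)  # inclusive


def component_versions(kubectl_json: str) -> list:
    """Collect the component-version annotation from `kubectl get daemonset ... -o json` output."""
    doc = json.loads(kubectl_json)
    # A label-selector query returns a List object; fall back to a single object otherwise.
    items = doc.get("items", [doc])
    versions = []
    for item in items:
        annotations = (
            item.get("spec", {})
            .get("template", {})
            .get("metadata", {})
            .get("annotations") or {}
        )
        if ANNOTATION in annotations:
            versions.append(annotations[ANNOTATION])
    return versions


def is_affected(component_version: str) -> bool:
    """Return True if the version lies between 0.4.272 and 0.4.280, inclusive."""
    # Assumption: the value is purely numeric and dot-separated.
    parsed = tuple(int(part) for part in component_version.strip().split("."))
    return AFFECTED_MIN <= parsed <= AFFECTED_MAX
```

For example, `is_affected("0.4.276")` returns True and `is_affected("0.4.281")` returns False. The JSON text passed to `component_versions` would come from running the kubectl command shown above with `-o json`.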

Workaround:

  • Customers using GKE Rapid Channel can upgrade their cluster control plane to 1.28.2-gke.1157000 and above, or 1.27.7-gke.1038000 and above.
  • Customers on GKE Regular Channel, GKE Stable Channel, or who are not using release channels do not have a workaround at this time.
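
For Rapid Channel clusters, the control plane version comparison in the workaround can be sketched as follows. This is an illustrative sketch only: the names `parse_gke_version` and `has_fix` are hypothetical, and it assumes that "and above" applies within each minor release (1.27.x and 1.28.x) and that minor releases newer than 1.28 also carry the fix, which the incident text does not state explicitly.

```python
import re

# Minimum fixed control plane versions quoted in the workaround, keyed by minor release.
FIXED_MINIMUMS = {
    (1, 27): (1, 27, 7, 1038000),
    (1, 28): (1, 28, 2, 1157000),
}

# Matches version strings of the form 1.28.2-gke.1157000.
_GKE_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def parse_gke_version(value: str) -> tuple:
    """Split a GKE version string into (major, minor, patch, gke_build)."""
    match = _GKE_VERSION.match(value.strip())
    if match is None:
        raise ValueError(f"not a GKE version string: {value!r}")
    return tuple(int(group) for group in match.groups())


def has_fix(control_plane_version: str) -> bool:
    """Return True if the control plane version meets the workaround's minimum."""
    version = parse_gke_version(control_plane_version)
    minor = version[:2]
    if minor in FIXED_MINIMUMS:
        return version >= FIXED_MINIMUMS[minor]
    # Assumption: minors newer than 1.28 include the fix; older minors do not.
    return minor > (1, 28)
```

For example, `has_fix("1.28.2-gke.1157000")` returns True, while `has_fix("1.27.6-gke.2000000")` returns False because the patch release is below 1.27.7.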

6 Nov 2023 13:47 PST

Summary: Google Kubernetes Engine customers with Workload Identity enabled may see a high application logging rate

Description: There is no impact to cluster functionality or performance. The only impact is extra logs in Cloud Logging.

gke-metadata-server is a GKE-managed system workload that is part of the GKE Workload Identity feature. Versions 0.4.272 to 0.4.280 of gke-metadata-server contain an incorrect configuration that results in a high rate of debug logs that contain the string "Unable to sync sandbox". These logs are then ingested into Cloud Logging, consuming Cloud Logging ingestion quota, and causing excess billable usage when exceeding the free monthly allotment.

The root cause has been identified and we are starting the rollout of a fix across the fleet.

We will provide more information by Wednesday, 2023-11-08 14:00 US/Pacific.

Diagnosis: Customers can determine whether their cluster is impacted by inspecting the gke-metadata-server daemonset with kubectl get daemonset -n kube-system -l k8s-app=gke-metadata-server -o yaml and looking at the components.gke.io/component-version annotation in .spec.template.metadata.annotations. If the value is a version between 0.4.272 and 0.4.280 (inclusive), then the cluster is currently affected.

Workaround:

  • Customers using GKE Rapid Channel can upgrade their cluster control plane to 1.28.2-gke.1157000 and above, or 1.27.7-gke.1038000 and above.
  • Customers on GKE Regular Channel, GKE Stable Channel, or who are not using release channels do not have a workaround at this time.

3 Nov 2023 14:08 PDT

Summary: Google Kubernetes Engine customers using gke-metadata-server versions 0.4.272 to 0.4.280 may see a high application logging rate

Description: There is no impact to cluster functionality or performance. The only impact is extra logs in Cloud Logging.

gke-metadata-server is a GKE-managed system workload that is part of the GKE Workload Identity feature. Versions 0.4.272 to 0.4.280 of gke-metadata-server contain an incorrect configuration that results in a high rate of debug logs that contain the string "Unable to sync sandbox". These logs are then ingested into Cloud Logging, consuming Cloud Logging ingestion quota, and causing excess billable usage when exceeding the free monthly allotment.

The root cause has been identified and we are working on rolling out a fix across the fleet.

Customers can determine whether their cluster is impacted by inspecting the gke-metadata-server daemonset with kubectl get daemonset -n kube-system -l k8s-app=gke-metadata-server -o yaml and looking at the components.gke.io/component-version annotation in .spec.template.metadata.annotations. If the value is a version between 0.4.272 and 0.4.280 (inclusive), then the cluster is currently affected.

We will provide more information by Monday, 2023-11-06 14:00 US/Pacific.

Diagnosis:

  • Impact is limited to gke-metadata-server versions 0.4.272 to 0.4.280.
  • gke-metadata-server and customer applications that depend on it should continue to work.
  • gke-metadata-server may be exhausting the customer's Cloud Logging quota.

Workaround:

  • Customers using GKE Rapid Channel can upgrade their cluster control plane to 1.28.2-gke.1157000 or above.
  • Customers on GKE Regular Channel, GKE Stable Channel, or who are not using release channels do not have a workaround at this time.

2 Nov 2023 14:09 PDT

Summary: Google Kubernetes Engine customers using gke-metadata-server versions 0.4.272 to 0.4.280 may see a high application logging rate

Description: gke-metadata-server is a GKE-managed system workload that is part of the GKE Workload Identity feature. This feature is opt-in on GKE Standard, and always enabled on GKE Autopilot.

Versions 0.4.272 to 0.4.280 of gke-metadata-server contain a bug that results in a high rate of debug logs that contain the string "Unable to sync sandbox". These logs are then ingested into Cloud Logging, consuming Cloud Logging ingestion quota, and causing excess billable usage when exceeding the free monthly allotment.

gke-metadata-server and customer applications that depend on it should continue to work, with some increased latency on metadata server calls.

The root cause has been identified and we are working on rolling out a fix across the fleet. A fixed gke-metadata-server version is currently available on the Rapid channel and can be obtained by upgrading your cluster's control plane to 1.28.2-gke.1157000 or above.

You can determine whether your cluster is impacted by inspecting the gke-metadata-server daemonset with kubectl get daemonset -n kube-system -l k8s-app=gke-metadata-server -o yaml and looking at the components.gke.io/component-version annotation in .spec.template.metadata.annotations. If the value is a version between 0.4.272 and 0.4.280 (inclusive), then your cluster is currently affected.

We will provide more information by Friday, 2023-11-03 14:00 US/Pacific.

Diagnosis:

  • Affected clusters should continue to work, with possibly reduced service from gke-metadata-server because it is spending CPU on the excessive logging.
  • gke-metadata-server may be exhausting the customer's Cloud Logging quota.

Workaround: None at this time.

2 Nov 2023 13:32 PDT

Summary: Google Kubernetes Engine customers using gke-metadata-server versions 0.4.272 to 0.4.280 may see a high application logging rate

Description: We are experiencing an issue with Google Kubernetes Engine.

Our engineering team continues to investigate the issue.

We will provide an update by Thursday, 2023-11-02 15:00 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: None at this time.

Workaround: None at this time.