Google Cloud Status Dashboard

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google Kubernetes Engine

Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept.

Incident began at 2021-09-27 10:17 and ended at 2021-09-29 06:52 (all times are US/Pacific).

Date Time Description
29 Sep 2021 11:32 PDT

We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case using https://cloud.google.com/support

(All Times US/Pacific)

Incident Start: 22 September 2021 00:00

Incident End: 29 September 2021 05:40

Duration: 7 days, 5 hours, 40 minutes

Affected Services and Features:

  • Google Kubernetes Engine (GKE)
  • Google Compute Engine (GCE)
  • Google Cloud Build (GCB)

Regions/Zones: Global

Description:

Google Kubernetes Engine, Google Compute Engine, and Google Cloud Build experienced connection failures in Docker workloads connecting to Google Cloud Load Balancers (GCLB) and to destinations hosted behind content distribution networks (CDNs) with a specific network configuration, such as debian.org and github.com.

From preliminary analysis, the root cause of the issue was a rollout to the network virtualization stack, which inadvertently broke mechanisms to tolerate Maximum Transmission Unit (MTU) mismatches. This change, together with an MTU mismatch between nested Docker containers and Virtual Machines (VMs) and the way the GCP networking subsystem handles large packets, resulted in packet drops, leading to timeouts and connection errors for Docker workloads.
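To make the MTU mismatch concrete, the following is a minimal sketch of the arithmetic involved. The values 1500 (a container bridge MTU) and 1460 (a VM network MTU) are assumptions chosen for illustration; the report does not state the actual values, and the function names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in a single packet of the given MTU."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def oversize_bytes(inner_mtu: int, outer_mtu: int) -> int:
    """Bytes by which a full-size inner packet exceeds the outer MTU (0 if it fits)."""
    return max(0, inner_mtu - outer_mtu)


if __name__ == "__main__":
    container_mtu = 1500  # assumed MTU seen by the nested Docker container
    vm_mtu = 1460         # assumed MTU of the VM's network interface

    # The container advertises an MSS derived from its own, larger MTU.
    print("MSS advertised by container:", tcp_mss(container_mtu))
    # The VM's link can only carry segments derived from the smaller MTU.
    print("MSS the VM link can carry:", tcp_mss(vm_mtu))
    # Full-size packets from the container are too large by this many bytes.
    print("Oversize by:", oversize_bytes(container_mtu, vm_mtu), "bytes")
```

Under these assumed values, a full-size packet from the container is larger than the VM's link can carry. Unless something along the path fragments it, clamps the MSS, or signals the sender to shrink its packets, it is dropped, which is consistent with the timeouts described above.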

Customer Impact:

  • Google Kubernetes Engine: Increased timeout and “Connection Failed” errors when connecting to GCLB or to destinations hosted behind content distribution networks (CDNs) with a specific network configuration.
  • Google Compute Engine: Increased timeout and “Connection Failed” errors for Docker workloads running on GCE when connecting to GCLB or to destinations hosted behind CDNs with a specific network configuration.
  • Google Cloud Build: Increased build failures for builds that fetch sources from repositories hosted behind CDNs with a specific network configuration.

29 Sep 2021 06:52 PDT

The issue with Google Kubernetes Engine has been resolved for all affected users as of Wednesday, 2021-09-29 06:52 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

29 Sep 2021 03:33 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept.

Description: Mitigation work is still underway by our engineering team. The rollback is in progress and in the final stages of completion.

The mitigation is expected to complete by Wednesday, 2021-09-29 07:00 US/Pacific

We will provide more information by Wednesday, 2021-09-29 08:00 US/Pacific.

Diagnosis: Some customers may be experiencing connection failures with a timeout error in Docker workflows connecting to GCLB, GCB, or Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.
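The update does not reproduce the init container manifest itself. As a rough illustration of the general idea only, the sketch below builds a hypothetical init container entry that clamps the TCP MSS to the path MTU with a standard iptables rule and adds it to a pod spec. The container name, the image reference `example.invalid/iptables:latest`, the helper function names, and the choice of MSS clamping are all assumptions; this is not the manifest Google distributed.

```python
import json

# Standard iptables syntax for TCP MSS clamping. Using it here is an
# assumption; the status update does not say what the real manifest runs.
CLAMP_MSS_RULE = (
    "iptables -t mangle -A POSTROUTING -p tcp "
    "--tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu"
)


def build_init_container(image: str) -> dict:
    """Return a hypothetical init container entry that applies the clamp rule."""
    return {
        "name": "clamp-mss",  # hypothetical name
        "image": image,       # must provide sh and iptables
        "command": ["sh", "-c", CLAMP_MSS_RULE],
        # Changing iptables rules requires the NET_ADMIN capability.
        "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}},
    }


def add_init_container(pod_spec: dict, image: str) -> dict:
    """Return a copy of pod_spec with the init container appended."""
    patched = dict(pod_spec)
    existing = list(pod_spec.get("initContainers", []))
    patched["initContainers"] = existing + [build_init_container(image)]
    return patched


if __name__ == "__main__":
    # Illustrative Docker-in-Docker pod spec; not taken from the report.
    pod_spec = {"containers": [{"name": "dind", "image": "docker:dind"}]}
    patched = add_init_container(pod_spec, "example.invalid/iptables:latest")
    print(json.dumps(patched, indent=2))
```

The script prints the patched pod spec as JSON, which could be merged into a deployment's pod template. Whether a rule like this is sufficient for a given cluster depends on its network setup, so treat it as a starting point for testing rather than a verified fix.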

29 Sep 2021 00:31 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept.

Description: Mitigation work is still underway by our engineering team. The rollback is in progress and in the final stages of completion.

The mitigation is expected to complete by Wednesday, 2021-09-29 03:00 US/Pacific

We will provide more information by Wednesday, 2021-09-29 03:30 US/Pacific.

Diagnosis: Some customers may be experiencing connection failures with a timeout error in Docker workflows connecting to GCLB, GCB, or Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.

28 Sep 2021 22:00 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept.

Description: Mitigation work is still underway by our engineering team. The rollback is in progress and on track towards completion.

The mitigation is expected to complete by Wednesday, 2021-09-29 00:00 US/Pacific

We will provide more information by Wednesday, 2021-09-29 00:30 US/Pacific.

Diagnosis: Some customers may be experiencing connection failures with a timeout error in Docker workflows connecting to GCLB, GCB, or Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.

28 Sep 2021 17:46 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept.

Description: Mitigation work is still underway by our engineering team.

The mitigation is expected to complete by Wednesday, 2021-09-29 00:00 US/Pacific

We will provide more information by Tuesday, 2021-09-28 22:00 US/Pacific.

Diagnosis: Some customers may be experiencing connection failures with a timeout error in Docker workflows connecting to GCLB, GCB, or Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.

28 Sep 2021 13:20 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE.

Description: Mitigation work is still underway by our engineering team.

The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific

We will provide more information by Tuesday, 2021-09-28 18:00 US/Pacific.

Diagnosis: Some customers may be experiencing connection failures with a timeout error in Docker workflows connecting to GCLB, GCB, or Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.

28 Sep 2021 12:20 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE.

Description: Mitigation work is still underway by our engineering team.

The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific

We will provide more information by Tuesday, 2021-09-28 18:00 US/Pacific.

Diagnosis: Some customers might experience connection failures with a timeout error in Docker workflows connecting to GCLB or Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.

28 Sep 2021 12:02 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE.

Description: Mitigation work is still underway by our engineering team.

The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific

We will provide more information by Tuesday, 2021-09-28 18:00 US/Pacific.

Diagnosis: Some customers might experience connection failures with a timeout error in Docker workflows connecting to Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.

28 Sep 2021 06:05 PDT

Summary: Global: We have identified a Networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE.

Description: Mitigation work is currently underway by our engineering team.

The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific

We will provide more information by Tuesday, 2021-09-28 12:00 US/Pacific.

Diagnosis: Some customers might experience connection failures with a timeout error in Docker workflows connecting to Fastly destinations such as debian.org and github.com.

Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures packets are sent with an MTU that works with Fastly destinations.