Service Health
Incident affecting Google Kubernetes Engine, Google Compute Engine, Cloud Build
Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept.
Incident began at 2021-09-27 10:17 and ended at 2021-09-29 06:52 (all times are US/Pacific).
| Date | Time | Description |
|---|---|---|
| 29 Sep 2021 | 11:32 PDT | We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case using https://cloud.google.com/support (All Times US/Pacific) Incident Start: 22 September 2021 00:00 Incident End: 29 September 2021 05:40 Duration: 7 days, 5 hours, 40 minutes Affected Services and Features:
Regions/Zones: Global Description: Google Kubernetes Engine, Google Compute Engine, and Google Cloud Build experienced connection failures in Docker workloads to Google Cloud Load Balancers (GCLB) and destinations hosted behind content distribution networks (CDNs) with a specific network configuration, such as debian.org and github.com. From preliminary analysis, the root cause of the issue was a rollout to the network virtualization stack, which inadvertently broke mechanisms to tolerate Maximum Transmission Unit (MTU) mismatches. This change, combined with an MTU mismatch between nested Docker containers and virtual machines (VMs) and with how the GCP networking subsystem handles large packets, resulted in packet drops, leading to timeouts and connection errors for Docker workloads. Customer Impact: Google Kubernetes Engine: Increased timeout and “Connection Failed” errors when connecting to GCLB or destinations hosted behind content distribution networks (CDNs) with a specific network configuration. Google Compute Engine: Increased timeout and “Connection Failed” errors when connecting to GCLB or destinations hosted behind content distribution networks (CDNs) with a specific network configuration for Docker workloads running on GCE. Google Cloud Build: Increased build failures for builds which fetch sources from repositories hosted behind content distribution networks (CDNs) with a specific network configuration. |
| 29 Sep 2021 | 06:52 PDT | The issue with Google Kubernetes Engine has been resolved for all affected users as of Wednesday, 2021-09-29 06:52 US/Pacific. We thank you for your patience while we worked on resolving the issue. |
| 29 Sep 2021 | 03:33 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept. Description: Mitigation work is still underway by our engineering team. The rollback is in progress and in the final stages of completion. The mitigation is expected to complete by Wednesday, 2021-09-29 07:00 US/Pacific. We will provide more information by Wednesday, 2021-09-29 08:00 US/Pacific. Diagnosis: Some customers may be experiencing connection failures in Docker workloads to GCLB, GCB, or Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 29 Sep 2021 | 00:31 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept. Description: Mitigation work is still underway by our engineering team. The rollback is in progress and in the final stages of completion. The mitigation is expected to complete by Wednesday, 2021-09-29 03:00 US/Pacific. We will provide more information by Wednesday, 2021-09-29 03:30 US/Pacific. Diagnosis: Some customers may be experiencing connection failures in Docker workloads to GCLB, GCB, or Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 28 Sep 2021 | 22:00 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept. Description: Mitigation work is still underway by our engineering team. The rollback is in progress and on track for completion. The mitigation is expected to complete by Wednesday, 2021-09-29 00:00 US/Pacific. We will provide more information by Wednesday, 2021-09-29 00:30 US/Pacific. Diagnosis: Some customers may be experiencing connection failures in Docker workloads to GCLB, GCB, or Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 28 Sep 2021 | 17:46 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Mitigation in progress. ETA - 00:00 PST, 29th Sept. Description: Mitigation work is still underway by our engineering team. The mitigation is expected to complete by Wednesday, 2021-09-29 00:00 US/Pacific. We will provide more information by Tuesday, 2021-09-28 22:00 US/Pacific. Diagnosis: Some customers may be experiencing connection failures in Docker workloads to GCLB, GCB, or Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 28 Sep 2021 | 13:20 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Description: Mitigation work is still underway by our engineering team. The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific. We will provide more information by Tuesday, 2021-09-28 18:00 US/Pacific. Diagnosis: Some customers may be experiencing connection failures in Docker workloads to GCLB, GCB, or Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 28 Sep 2021 | 12:20 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Description: Mitigation work is still underway by our engineering team. The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific. We will provide more information by Tuesday, 2021-09-28 18:00 US/Pacific. Diagnosis: Some customers might experience connection failures in Docker workloads to GCLB or Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 28 Sep 2021 | 12:02 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Description: Mitigation work is still underway by our engineering team. The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific. We will provide more information by Tuesday, 2021-09-28 18:00 US/Pacific. Diagnosis: Some customers might experience connection failures in Docker workloads to Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
| 28 Sep 2021 | 06:05 PDT | Summary: Global: We have identified a networking connectivity issue that impacts Docker workloads inside GKE and potentially GCE. Description: Mitigation work is currently underway by our engineering team. The mitigation is expected to complete by Thursday, 2021-09-30 18:00 US/Pacific. We will provide more information by Tuesday, 2021-09-28 12:00 US/Pacific. Diagnosis: Some customers might experience connection failures in Docker workloads to Fastly destinations such as debian.org and github.com, with a timeout error. Workaround: If you are impacted, please try adding an init container manifest to your Docker-in-Docker deployment. This ensures that packets are sent with a proper MTU that works with Fastly destinations. |
- All times are US/Pacific
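
The incident description attributes the packet drops in part to an MTU mismatch between nested Docker containers and the VM. As an illustration only, and not the workaround manifest referenced in the updates above, the following Python sketch shows one way such a mismatch might be spotted on a Linux host. The function names `read_mtus` and `oversized_interfaces`, the interface names `eth0` and `docker0`, and the MTU values 1460 and 1500 are assumptions chosen for the example; actual names and values depend on the environment.

```python
from pathlib import Path


def read_mtus(sys_net="/sys/class/net"):
    """Return {interface_name: mtu} read from Linux sysfs (empty if unavailable)."""
    root = Path(sys_net)
    mtus = {}
    if not root.is_dir():
        return mtus
    for iface in sorted(root.iterdir()):
        try:
            mtus[iface.name] = int((iface / "mtu").read_text().strip())
        except (OSError, ValueError):
            continue  # entry is not an interface or exposes no readable MTU
    return mtus


def oversized_interfaces(mtus, uplink):
    """Return (name, mtu) pairs whose MTU is larger than the uplink's MTU."""
    limit = mtus[uplink]
    return [
        (name, mtu)
        for name, mtu in sorted(mtus.items())
        if name not in (uplink, "lo") and mtu > limit
    ]


if __name__ == "__main__":
    # Illustrative values only (assumed): VM uplink at 1460, Docker bridge at 1500.
    sample = {"lo": 65536, "eth0": 1460, "docker0": 1500}
    for name, mtu in oversized_interfaces(sample, uplink="eth0"):
        print(f"{name}: MTU {mtu} exceeds uplink MTU {sample['eth0']}")
```

On a real host, `oversized_interfaces(read_mtus(), uplink="eth0")` would run the same check against live interfaces, assuming the VM's primary interface is named `eth0`. Any interface it reports can emit packets larger than the VM's uplink can carry, which is the kind of mismatch the description refers to.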