Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google Cloud Networking, Google Compute Engine, Google Cloud VMware Engine, Google Cloud SQL, Google Kubernetes Engine

Customers may experience traffic loss across multiple products with requests destined to and from us-west2

Incident began at 2021-05-04 15:35 and ended at 2021-05-04 21:08 (all times are US/Pacific).

Date Time Description
6 May 2021 12:12 PDT

The following is the incident report for the networking outage that occurred on May 4, 2021.

(All Times US/Pacific)

Incident Start: 2021-05-04 15:35

Incident End: 2021-05-04 21:08

Duration: 5 hours, 33 minutes
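
The stated duration follows directly from the start and end timestamps. A minimal sketch of that arithmetic in Python, using only the standard library (the variable names are illustrative and not part of the report):

```python
from datetime import datetime

# Timestamps copied from the report. Both are US/Pacific on the same day,
# so naive datetimes are sufficient for the subtraction.
incident_start = datetime(2021, 5, 4, 15, 35)
incident_end = datetime(2021, 5, 4, 21, 8)

duration = incident_end - incident_start
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

print(f"{hours} hours, {minutes} minutes")  # 5 hours, 33 minutes
```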

Affected Services: Google Cloud Networking, Google Compute Engine (GCE), Google Cloud VMware Engine, Cloud SQL, and Google Kubernetes Engine (GKE)

Features: Cloud VPN, Cloud Interconnect, Google Private Access

Regions/Zones: us-west2

Description: Google Cloud Platform experienced an outage affecting network traffic in region us-west2 for a duration of 5 hours and 33 minutes. This impacted Internet and Cloud Interconnect connectivity to/from us-west2, including traffic between GCE VMs in the region and Internet endpoints, VM-to-VM traffic over Public IPs, External Network Load Balancing, Cloud VPN Classic (non-HA), and Cloud Interconnect. Cloud VPN HA was not impacted.

Root cause and mitigation:

The root cause was a rollout that changed some internal network settings on the machines that handle internet routing to Cloud Services. Machines that received the change were unable to receive network programming information: new TCP connections were established successfully, but some packets sent between the Maglev[1] control plane and data plane were dropped.

Maglevs route traffic from public IPs and interconnects to various endpoints such as Cloud VPN tasks, individual instances, and groups of instances. When a Maglev task first starts, it must be programmed before it can start routing traffic. As independent Maglev control plane and data plane rollouts restarted tasks, their long-standing TCP connections were reset, and the newly established connections were unable to exchange programming messages.

This was mitigated by rolling back the configuration change once the root cause was identified.

[1] https://research.google/pubs/pub44824
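
The failure mode described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `Host` and `MaglevTask` classes, their fields, and the `try_program` method are hypothetical names invented for illustration and do not reflect Google's actual implementation. The model shows why already-programmed tasks kept routing after the change landed, while restarted tasks connected successfully but never became programmed.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A machine running a Maglev task (hypothetical toy model)."""
    has_bad_config: bool = False


@dataclass
class MaglevTask:
    """Toy task that can route traffic only after it has been programmed."""
    host: Host
    connected: bool = False
    programmed: bool = False

    def try_program(self) -> None:
        # Assumption: on an affected host, programming messages are dropped
        # even though the TCP connection itself is up.
        if self.connected and not self.host.has_bad_config:
            self.programmed = True

    def start(self) -> None:
        # A (re)start resets the old connection and loses programmed state.
        self.programmed = False
        # The new TCP connection is established successfully either way.
        self.connected = True
        self.try_program()

    @property
    def routing(self) -> bool:
        return self.connected and self.programmed


# Two tasks start and are programmed before the settings change.
hosts = [Host(), Host()]
tasks = [MaglevTask(h) for h in hosts]
for task in tasks:
    task.start()

# The change lands on both hosts; long-standing connections keep working.
for host in hosts:
    host.has_bad_config = True
routing_after_change = [t.routing for t in tasks]

# An independent rollout restarts only the first task.
tasks[0].start()
routing_after_restart = [t.routing for t in tasks]

# Rolling back the change lets the restarted task be programmed again.
hosts[0].has_bad_config = False
tasks[0].try_program()
routing_after_rollback = [t.routing for t in tasks]

print(routing_after_change)    # [True, True]
print(routing_after_restart)   # [False, True]
print(routing_after_rollback)  # [True, True]
```

In this model the impact grows only as tasks are restarted, which is consistent with the report's description of restarts during independent rollouts triggering the loss.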

Customer Impact:

  • There were two impact windows, with minor impact from 15:35 to 17:50, and major impact from 17:50 to 21:08.

  • GCE and GKE experienced up to 50% egress and ingress traffic loss in us-west2 for traffic over public IP. Internal IP traffic was not affected.

  • Cloud VPN Classic gateways experienced failure of up to 50% of tunnels.

  • Cloud HA VPN experienced no loss of connectivity for customers with properly configured redundant interfaces. Interface 0 of HA VPN gateways experienced up to 70% tunnel failure. Interface 1 was unaffected.

  • Cloud Interconnect experienced ~50% loss.

  • Google Cloud VMware Engine was impacted: control plane services in the us-west2 region were not reachable by customers. In addition, some customers whose VMware Private Clouds were in the region may have experienced issues with some VMware services not connecting to their cloud or on-premises networks and services.

  • Google services accessed over Interconnect or VPN (e.g. Google Private Access) would have experienced similar loss.

  • Customers using Cloud SQL instances in us-west2 may have experienced failed queries or failed connection attempts during the outage.
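
The HA VPN item in the list above depends on redundancy: a gateway keeps connectivity as long as at least one of its interfaces still has a working tunnel. A minimal sketch of that reasoning follows. The function `gateway_has_connectivity` and the dictionaries are hypothetical illustrations, not a Google Cloud API, and the tunnel states are example inputs rather than data from the incident.

```python
from typing import Mapping


def gateway_has_connectivity(tunnel_up_by_interface: Mapping[int, bool]) -> bool:
    """Return True if at least one configured interface has a working tunnel."""
    return any(tunnel_up_by_interface.values())


# Redundant setup: the interface 0 tunnel failed, interface 1 is unaffected.
redundant = {0: False, 1: True}

# Non-redundant setup (assumed for contrast): only interface 0 is configured
# and its tunnel failed.
single = {0: False}

print(gateway_has_connectivity(redundant))  # True
print(gateway_has_connectivity(single))     # False
```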

Additional Details:

  • The issue was mitigated by performing a full rollback of the changes.
  • We are working on improving the testbeds for this type of change, as well as improving our monitoring and visibility for this type of interaction.

4 May 2021 21:50 PDT

The issue with Cloud Networking has been resolved for all affected users as of approximately Tuesday, 2021-05-04 21:15 US/Pacific.

Customers affected by this issue observed traffic loss and were unable to reach VPN or Interconnect gateways to and from resources in us-west2 between 2021-05-04 17:37 and 21:15 US/Pacific.

The following products were impacted:

Google Compute Engine/Google Kubernetes Engine (any resources or products using these products may also have been impacted): May have experienced high traffic loss and connection errors to and from us-west2 over public IP. Internal IP traffic should have continued to work as normal.

Cloud Interconnect/Cloud VPN: May have been unable to reach the gateway and may have seen high traffic loss.

Google Private Access: May have seen high packet loss.

Google Cloud VMware Engine: Some instances may have entered a 'down' state.

We thank you for your patience while we worked on resolving the issue.

4 May 2021 20:36 PDT

Summary: Customers may experience traffic loss across multiple products with requests destined to and from us-west2

Description: Our engineering team continues their investigation into this issue.

Affected customers will see traffic loss and may be unable to reach VPN or Interconnect gateways to and from resources in us-west2, beginning at Tuesday, 2021-05-04 17:37 US/Pacific.

The following products are currently impacted:

  • Google Compute Engine/Google Kubernetes Engine: May see errors with connections to and from us-west2 over public IP. Internal IP traffic should continue to work as normal.
  • Cloud Interconnect/Cloud VPN: May see some session disconnects and high traffic loss.
  • Google Private Access: May see high packet loss.

We will provide an update by Tuesday, 2021-05-04 23:30 US/Pacific with current details.

Diagnosis: None at this time.

Workaround: None at this time.

4 May 2021 19:03 PDT

Summary: Customers may experience traffic loss across multiple products with requests destined to and from us-west2

Description: We are experiencing an intermittent issue with Cloud Networking beginning at Tuesday, 2021-05-04 17:37 US/Pacific.

The following products are currently impacted:

  • Google Compute Engine: May see errors with connections to and from us-west2 over public IP. Internal IP traffic should continue to work as normal.
  • Cloud Interconnect: May see some session disconnects.
  • Cloud VPN: May see some session disconnects.

Our engineering team continues to investigate the issue.

We will provide an update by Tuesday, 2021-05-04 20:30 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: None at this time.

Workaround: None at this time.