Service Health

Incident affecting Google Cloud Networking, Hybrid Connectivity

Global : Cloud Networking faced severe packet loss

Incident began at 2022-05-20 13:47 and ended at 2022-05-20 14:07 (all times are US/Pacific).

Previously affected location(s)

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Delhi (asia-south2)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Montréal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • São Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Date Time Description
2 Jun 2022 12:11 PDT

INCIDENT REPORT

SUMMARY:

On Friday, 20 May 2022 at 13:47 US/Pacific, Google Cloud Networking experienced intermittent packet loss for traffic between multiple cloud regions for a duration of 20 minutes. The issue was identified and mitigated automatically by 14:07 US/Pacific.

We understand this issue has affected our valued customers and users, and we apologize to those who were affected.

ROOT CAUSE:

Google’s production backbone is a global network that enables connectivity for all user-facing traffic via Points of Presence (POPs) or internet exchanges.

A failure of a component on a fiber path from one of the central US gateway campuses in Google’s production backbone led to a decrease in available network bandwidth between the gateway and multiple edge locations, causing packet loss while the backbone automatically moved traffic onto remaining paths.

The network topology in this region is being augmented, and the second-best path has not completed its augmentation. This meant some traffic needed to reroute onto the third-best path, which led to an extended traffic migration period. This disruption was more severe than we had anticipated, and is the subject of remediation actions below.
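The fallback behavior described above can be illustrated with a toy model. The sketch below is not Google's routing logic: the `Path` type, the path names, and the `healthy` and `ready` flags are hypothetical, chosen only to show how a failed best path combined with an unfinished second-best path pushes traffic onto the third-best option.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Path:
    name: str      # hypothetical label, not a real link name
    rank: int      # 1 = best path, 2 = second-best, and so on
    healthy: bool  # False once a component on the path has failed
    ready: bool    # False while augmentation work is incomplete


def pick_path(paths: Sequence[Path]) -> Optional[Path]:
    """Return the best-ranked path that is both healthy and ready."""
    usable = [p for p in paths if p.healthy and p.ready]
    return min(usable, key=lambda p: p.rank, default=None)


# Assumed state during the incident: the best path has failed and the
# second-best path is still being augmented.
paths = [
    Path("primary", rank=1, healthy=False, ready=True),
    Path("second-best", rank=2, healthy=True, ready=False),
    Path("third-best", rank=3, healthy=True, ready=True),
]

chosen = pick_path(paths)
print(chosen.name if chosen else "no usable path")
```

In this simplified model the selection lands on the third-best path, which matches the situation the report describes; the model says nothing about how long the real migration takes.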

REMEDIATION AND PREVENTION:

Google’s automated repair mechanisms detected the decrease in available network bandwidth on Friday, 20 May 2022 at 13:47 US/Pacific and automatically routed the traffic through alternate links. The traffic rerouting completed on Friday, 20 May 2022 at 14:06 US/Pacific, mitigating the issue.

While our automated mechanisms worked as intended and recovered the traffic without manual intervention, we understand that the scope of impact caused by this event affected our customers.

We have been working on optimizing our global network to minimize the time spent automatically reconfiguring around failures like this (known as "convergence time"). While we have made progress, efforts to improve are ongoing. We continue to ensure that the current technology is optimally configured to minimize the frequency and severity of these issues.

In this network region, we intend to complete the augmentation by 11 July 2022, which will return the network to the intended topology where any single fiber path failure can be rerouted quickly onto the second-best path.

We will also build automatic analysis to ensure that the network topology during augmentation always supports fast convergence.
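The report does not describe how this analysis will be built. As a rough illustration only, the sketch below checks a hypothetical topology inventory and flags any gateway-to-edge pair where a single path failure would leave traffic without a ready first- or second-ranked path. The data layout, the `rank` and `ready` fields, and the pair names are all assumptions.

```python
from typing import Dict, List

# Hypothetical inventory: for each gateway/edge pair, the candidate paths
# with a rank (1 = best) and a flag for whether augmentation is complete.
Topology = Dict[str, List[dict]]


def at_risk_pairs(topology: Topology) -> List[str]:
    """Return pairs where some single path failure forces a slow fallback."""
    at_risk = []
    for pair, paths in topology.items():
        ranked = sorted(paths, key=lambda p: p["rank"])
        for failed in ranked:
            remaining = [p for p in ranked if p is not failed]
            # After the failure, traffic should land on a ready path that
            # is ranked first or second; otherwise flag the pair.
            fallback = next((p for p in remaining if p["ready"]), None)
            if fallback is None or fallback["rank"] > 2:
                at_risk.append(pair)
                break
    return at_risk


topology = {
    "gateway-a -> edge-1": [
        {"rank": 1, "ready": True},
        {"rank": 2, "ready": False},  # augmentation still in progress
        {"rank": 3, "ready": True},
    ],
    "gateway-a -> edge-2": [
        {"rank": 1, "ready": True},
        {"rank": 2, "ready": True},
        {"rank": 3, "ready": True},
    ],
}

print(at_risk_pairs(topology))
```

A check of this shape could run whenever the planned topology changes, so that an augmentation step that removes the fast fallback is caught before a failure exposes it.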

Google is committed to quickly and continually improving our technology and operations to prevent service disruptions. We appreciate your patience and apologize again for the impact to your organization. We thank you for your business.

DETAILED DESCRIPTION OF IMPACT:

  • Google Cloud Networking - Affected customers would have observed packet loss for ingress and egress traffic to and from central US cloud regions from 13:47 to 14:07 US/Pacific.

  • Cloud VPN - Affected customers would have experienced latency and errors for cross-region VPN with global routing from 13:47 to 14:07 US/Pacific. Intra-region traffic for Classic and HA VPN would not have been affected.

  • Cloud Router - Affected customers would have observed delays in global routing propagation from 13:47 to 14:07 US/Pacific.

  • Cloud Interconnect - Affected customers would have experienced latency and errors from 13:47 to 14:07 US/Pacific.

  • Google Cloud Storage - Affected customers would have experienced elevated latency, HTTP 500 errors, or transient API errors from 13:47 to 14:17 US/Pacific.

  • Cloud SQL - Affected customers would have experienced failures for Export, Update, and Delete operations as well as delayed data replication in us-west1 from 13:53 to 14:13 US/Pacific.

  • Messages - Affected customers would have experienced service availability issues from 13:49 to 14:05 US/Pacific.
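The impact windows listed above differ by service. The short sketch below only does arithmetic on the times already given in this list (20 May 2022, US/Pacific); the `WINDOWS` table and the `minutes` helper are illustrative names, not part of any Google tooling.

```python
from datetime import datetime

# Impact windows copied from the list above (start, end), US/Pacific.
WINDOWS = {
    "Google Cloud Networking": ("13:47", "14:07"),
    "Cloud VPN": ("13:47", "14:07"),
    "Cloud Router": ("13:47", "14:07"),
    "Cloud Interconnect": ("13:47", "14:07"),
    "Google Cloud Storage": ("13:47", "14:17"),
    "Cloud SQL": ("13:53", "14:13"),
    "Messages": ("13:49", "14:05"),
}


def minutes(start: str, end: str) -> int:
    """Whole minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


durations = {svc: minutes(start, end) for svc, (start, end) in WINDOWS.items()}
for service, mins in durations.items():
    print(f"{service}: {mins} min")

# Zero-padded HH:MM strings sort chronologically, so min/max work directly.
earliest = min(start for start, _ in WINDOWS.values())
latest = max(end for _, end in WINDOWS.values())
print(f"Overall: {earliest} to {latest} ({minutes(earliest, latest)} min)")
```

By these figures, Google Cloud Storage has the longest listed window (30 minutes), ending ten minutes after the 14:07 end of the networking incident, while Messages has the shortest (16 minutes).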

23 May 2022 14:33 PDT

Mini Incident Report

We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note that this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case at https://cloud.google.com/support or by following the help article at https://support.google.com/a/answer/1047213.

(All Times US/Pacific)

Incident Start: 20 May 2022 13:47

Incident End: 20 May 2022 14:07

Duration: 20 minutes

Affected Services and Features:

  • Google Cloud Networking
  • Cloud VPN
  • Cloud Router
  • Cloud Interconnect
  • Google Cloud Storage
  • Cloud SQL
  • Messages

Regions/Zones: Multiple regions

Description:

Google Cloud Networking experienced intermittent packet loss for some transit traffic, affecting inter-region connectivity for multiple cloud regions for 20 minutes. Preliminary analysis indicates that the root cause was a hardware issue on an optical (amplification) component affecting Google's user-facing backbone. The issue was identified and mitigated by our automated repair mechanism without manual intervention.

Customer Impact:

  • Google Cloud Networking - Affected customers would have observed packet loss for ingress and egress traffic to and from central US cloud regions.

  • Cloud VPN - Affected customers would have experienced latency and errors for cross-region VPN with global routing. Intra-region traffic for Classic and HA VPN would not have been affected.

  • Cloud Router - Affected customers would have observed delays in global routing propagation.

  • Cloud Interconnect - Affected customers would have experienced latency and errors.

  • Google Cloud Storage - Affected customers would have experienced elevated latency, HTTP 500 errors, or transient API errors.

  • Cloud SQL - Affected customers would have experienced failures for Export, Update, and Delete operations as well as delayed data replication in us-west1.

  • Messages - Affected customers would have experienced service availability issues.

20 May 2022 14:52 PDT

The issue with Google Cloud Networking has been resolved for all affected users as of Friday, 2022-05-20 14:07 US/Pacific.

Affected customers would have experienced high latency, timeouts and errors.

We thank you for your patience while we worked on resolving the issue.

20 May 2022 14:44 PDT

Summary: Global : Cloud Networking faced severe packet loss

Description: We've received a report of an issue with Google Cloud Networking as of Friday, 2022-05-20 13:47 US/Pacific.

We will provide more information by Friday, 2022-05-20 14:45 US/Pacific.

Diagnosis: Customers may have encountered elevated latency and errors.

Workaround: None at this time.