Service Health

Incident affecting Google Cloud Networking

Customers experienced a cloud networking disruption from 04:28 AM - 04:50 AM US/Pacific

Incident began at 2022-09-22 04:28 and ended at 2022-09-22 04:50 (all times are US/Pacific).

Previously affected location(s)

Iowa (us-central1), South Carolina (us-east1), Oregon (us-west1)

Date Time Description
3 Oct 2022 12:35 PDT

INCIDENT REPORT

Summary

On Thursday, 22 September 2022, Google Cloud experienced a traffic disruption in the wide-area network connecting the us-east1 and us-central1 cloud regions. Inter-region traffic in Google Cloud, and Internet-to-Google Cloud traffic, may have been disrupted if it transited this network path. We are aware of potential impact, lasting a total of 22 minutes, in several Cloud regions, including asia-east1, asia-northeast1, asia-southeast1, australia-southeast1, europe-west1, europe-west2, europe-west3, europe-west4, northamerica-northeast1, us-central1, us-east1, us-east4, us-west1, us-west2, and us-west4, as well as to Google Workspace.

Root Cause

The traffic disruption in Google's wide-area network was triggered by brief failures in fiber-optic cables, in the presence of a pre-existing failure nearby in the network.

These brief failures occurred progressively across an 18-minute period on Thursday, 22 September 2022, from 04:28 to 04:46 US/Pacific. Each event required a rerouting of traffic, extending the impact to 04:50.
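
The timing figures in this report can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the timestamps stated in this report; the helper name `minutes_between` and the variable names are illustrative assumptions, not part of any Google tooling.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # all times are US/Pacific, as stated in the report


def minutes_between(start: str, end: str) -> int:
    """Return the whole number of minutes between two timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


# Period during which the brief fiber-optic failures occurred.
fiber_failure_window = minutes_between("2022-09-22 04:28", "2022-09-22 04:46")

# Full impact period, including rerouting after the last failure.
total_impact_window = minutes_between("2022-09-22 04:28", "2022-09-22 04:50")

# Time spent rerouting after the last fiber event.
rerouting_tail = total_impact_window - fiber_failure_window

print(fiber_failure_window, total_impact_window, rerouting_tail)
```

Under these inputs, the failure window is 18 minutes, the total impact is 22 minutes, and the remaining 4 minutes correspond to rerouting after the final event.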

The pre-existing failure occurred on Tuesday, 20 September 2022 at 22:10 US/Pacific, and was still under repair at the time of the second failure on Thursday, 22 September.

Google's interregional backbone is designed with multiple levels of redundancy and is provisioned to reroute Cloud traffic with minimal disruption under all common failure scenarios. In this case, the backbone was designed with appropriate redundancy to survive this dual-failure scenario, but traffic in the affected regions experienced longer rerouting delays. Traffic flowing over other network links experienced disruption as rerouted traffic sought alternate, less congested backup paths.

Remediation and Prevention

Google's network reacted automatically to the 04:28 to 04:46 events, rerouting traffic within our design goals, and had fully recovered by 04:50.

Our network control software automatically removed the impacted links from service for our engineers to investigate, since unreliable paths cause more short-term impact than failed paths. There was no shortage of capacity at any time; all disruptions were caused by rerouting.

The probability and impact of these scenarios are exhaustively modeled to ensure such double failures occur very infrequently and do not exceed long-term (yearly and multi-year) availability targets.

Google is committed to preventing a repeat of this issue and is completing the following actions:

  • Verifying that fiber-optic cable maintenance procedures adequately manage the risk of physical interruption during repair and other maintenance activity, including hitless proactive traffic moves where appropriate.
  • Ensuring capacity modeling software is correctly assessing the risk of dual failures, and allowing more headroom for rerouted traffic where those failures are more likely.
  • Ensuring automatic removal of unreliable capacity is acting aggressively enough to avoid areas with progressive failures.
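
To illustrate the idea of automatically removing unreliable capacity, the following is a hypothetical sketch of a sliding-window flap detector. It is not Google's implementation: the class name `FlapDetector`, the parameters `max_flaps` and `window_seconds`, and the threshold values are all assumptions chosen for illustration.

```python
from collections import deque


class FlapDetector:
    """Hypothetical detector: flag a link for removal from service when
    it has too many brief failures (flaps) within a sliding time window."""

    def __init__(self, max_flaps: int, window_seconds: float) -> None:
        self.max_flaps = max_flaps            # assumed threshold
        self.window_seconds = window_seconds  # assumed window length
        self._flaps: deque = deque()          # timestamps of recent flaps

    def record_flap(self, timestamp: float) -> bool:
        """Record a flap; return True if the link should be removed."""
        self._flaps.append(timestamp)
        # Drop flaps that have aged out of the sliding window.
        while self._flaps and timestamp - self._flaps[0] > self.window_seconds:
            self._flaps.popleft()
        return len(self._flaps) >= self.max_flaps


# Example: three flaps within 10 minutes trigger removal; a later,
# isolated flap does not.
detector = FlapDetector(max_flaps=3, window_seconds=600)
decisions = [detector.record_flap(t) for t in (0, 200, 400, 2000)]
print(decisions)
```

In a sketch like this, a lower `max_flaps` or a longer `window_seconds` makes removal more aggressive, which is the kind of trade-off the last action item describes.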

Detailed Description of Impact

On 22 September 2022, between 04:28 and 04:50 US/Pacific, the following services (among others) may have been impacted for some customers in the following cloud regions, unless otherwise noted: asia-east1, asia-northeast1, asia-southeast1, australia-southeast1, europe-west1, europe-west2, europe-west3, europe-west4, northamerica-northeast1, us-central1, us-east1, us-east4, us-west1, us-west2, and us-west4.

Google Compute Engine

Affected Google Compute Engine customers may have experienced increased latency and packet loss between Compute Engine instances in affected regions.

Google Cloud Bigtable

A small percentage of customers may have experienced errors in API calls from 04:29 through 04:51 in asia-east1, us-central1, us-south1, us-west1, and us-west4.

Google Chat

Affected Google Chat customers would have experienced errors when accessing, creating, or responding to chats from 04:28 through 04:51.

Google Voice

Google Voice users might have experienced some of their actions failing during the impact window due to an internal error. This includes all actions, such as sending SMS, placing and receiving calls, and loading call history. Web, Android, and iOS users were affected.

Google Cloud Storage

A small percentage of Google Cloud Storage customers may have experienced errors in requests to GCS buckets in asia-south2, asia-southeast2, us-west1 and us-west2.

Google Cloud Load Balancing

Affected Google Cloud Load Balancing customers may have experienced increased HTTP 5XX errors. Globally, around 3M queries were served with a 5XX response during the two outage windows; us-west1, asia-east1, asia-south1, asia-south2, and asia-southeast1 saw most of the failing queries.

22 Sep 2022 14:22 PDT

Mini Incident Report

We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note that this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support.

(All Times US/Pacific)

Occurrence 1

Incident Start: 22 September 2022 04:30

Incident End: 22 September 2022 04:38

Duration: 8 minutes

Occurrence 2

Incident Start: 22 September 2022 04:48

Incident End: 22 September 2022 04:58

Duration: 10 minutes

Affected Services and Features:

Google Cloud Networking

Regions/Zones: us-central1, us-east1, us-west1

Description:

Customers using Google Cloud Networking experienced a network traffic disruption in the us-central1, us-east1, and us-west1 regions on 22 September 2022 for 8 minutes starting at 04:30 US/Pacific and for 10 minutes starting at 04:48 US/Pacific (total duration of 18 minutes). From preliminary analysis, the root cause of the issue was identified as failures of a high fraction of transport links between the affected regions.
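
The durations in this preliminary report follow directly from the occurrence start and end times listed above. The following is a minimal sketch of that arithmetic; the variable names are illustrative assumptions, and the timestamps are taken from the two occurrences as stated.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # all times US/Pacific

# (start, end) for each occurrence, as listed in this report.
occurrences = [
    ("2022-09-22 04:30", "2022-09-22 04:38"),
    ("2022-09-22 04:48", "2022-09-22 04:58"),
]

# Duration of each occurrence in whole minutes.
durations = [
    int((datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds() // 60)
    for start, end in occurrences
]

# Total time spent in disruption across both occurrences.
total_disruption = sum(durations)

# Quiet gap between the end of the first and the start of the second.
gap = int(
    (datetime.strptime(occurrences[1][0], FMT)
     - datetime.strptime(occurrences[0][1], FMT)).total_seconds() // 60
)

print(durations, total_disruption, gap)
```

With these inputs the occurrences last 8 and 10 minutes, for 18 minutes of disruption in total, separated by a 10-minute gap.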

Customer Impact:

Some customers using Cloud Networking experienced severe traffic disruption during both occurrences of the incident. Some cloud customers communicating outside the affected regions (including to the Internet) would have seen two periods of disruption: ~8 minutes starting at 04:30 AM and ~10 minutes starting at 04:48 AM US/Pacific.

22 Sep 2022 05:45 PDT

The issue with Google Cloud Networking has been resolved for all affected users as of Thursday, 2022-09-22 05:23 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

22 Sep 2022 05:30 PDT

Summary: Customers experienced a cloud networking disruption from 04:30 AM - 04:58 AM US/Pacific

Description: Customers might have experienced a cloud networking disruption from 04:30 AM - 04:58 AM US/Pacific as a result of an issue on the physical network.

We believe the network connectivity is currently stable.

We will provide an update by Thursday, 2022-09-22 06:45 US/Pacific with current details.

Diagnosis: All cloud customers communicating outside the region (including to the internet) would have seen two periods of disruption, ~8m at 04:30 AM, ~10m at 04:48 AM US/Pacific

Workaround: None