Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Cloud Armor, Google Cloud Networking, Cloud Load Balancing

Google Cloud Networking traffic loss visible in us-east4

Incident began at 2023-04-25 07:03 and ended at 2023-04-25 09:42 (all times are US/Pacific).

Previously affected location(s)

Northern Virginia (us-east4)

Date Time Description
3 May 2023 13:33 PDT

Incident Report

Summary

On Tuesday, 25 April 2023 at 07:03 a.m. US/Pacific, Google Cloud Load Balancer customers began experiencing low-level network packet loss in us-east4-a, which lasted 2 hours, 39 minutes. Google Cloud Networking also experienced intermittent network connectivity issues with Virtual Machines (VMs) and intra-zone packet loss from 08:44 to 08:48 a.m. US/Pacific. We sincerely apologize to our Google Cloud customers who were impacted during this disruption.
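
For readers who want to check the timeline arithmetic, the following Python sketch recomputes the durations from the timestamps reported in this summary. The helper names (`at`, `duration`, `contains`) are illustrative only, and the fixed UTC-7 offset assumes all timestamps are in PDT, as the update times on this page indicate.

```python
from datetime import datetime, timedelta, timezone

# Assumption: all times are PDT (UTC-7) on the single incident date.
PDT = timezone(timedelta(hours=-7), "PDT")


def at(hhmm: str) -> datetime:
    """Build a timestamp on 25 April 2023 from an HH:MM string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2023, 4, 25, hour, minute, tzinfo=PDT)


def duration(window):
    """Length of a (start, end) window."""
    start, end = window
    return end - start


def contains(outer, inner):
    """True if the inner window lies entirely within the outer window."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Windows as reported in the summary.
incident = (at("07:03"), at("09:42"))
intra_zone_loss = (at("08:44"), at("08:48"))

incident_duration = duration(incident)
intra_zone_duration = duration(intra_zone_loss)

print(incident_duration)                    # 2:39:00
print(intra_zone_duration)                  # 0:04:00
print(contains(incident, intra_zone_loss))  # True
```

The four-minute intra-zone window falls entirely inside the overall incident window, which is consistent with the reported total of 2 hours, 39 minutes.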

Root Cause

During the decommissioning process in one of our data centers, three racks were inadvertently disconnected from their power supplies due to a rack labeling error. These racks hosted the data center network controller applications, so their failure impacted network control plane connectivity and caused network traffic to fall back to static routing. The failover worked as expected; however, one component of a single networking device handled the backup static routes incorrectly, causing low-level packet loss for a small subset of customers. Once power to the control plane was restored, network reconvergence caused up to four minutes of intra-zone network packet loss.
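
As a rough illustration of this failure mode, the following Python sketch models a forwarding device that prefers controller-programmed routes and falls back to static backup routes when the controller is unreachable. Everything in it is hypothetical: the `Route` and `Device` names, the prefix and next-hop values, and in particular the `handles_static_correctly` flag, which stands in for the faulty component by simply failing the lookup. The report does not describe how the component actually mishandled the routes, so this is a toy model under those assumptions, not a description of Google's network.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Route:
    destination: str  # hypothetical prefix
    next_hop: str     # hypothetical next-hop name
    source: str       # "dynamic" (from the controller) or "static" (backup)


class Device:
    """Hypothetical forwarding device; not a model of any real component."""

    def __init__(self, static_routes: Iterable[Route],
                 handles_static_correctly: bool = True) -> None:
        self.static: Dict[str, Route] = {r.destination: r for r in static_routes}
        self.dynamic: Dict[str, Route] = {}
        self.controller_reachable = False
        # Assumption: a faulty component is modeled as a failed static lookup.
        self.handles_static_correctly = handles_static_correctly

    def program(self, routes: Iterable[Route]) -> None:
        """The controller installs dynamic routes."""
        self.dynamic = {r.destination: r for r in routes}
        self.controller_reachable = True

    def lose_controller(self) -> None:
        """Control plane connectivity is lost; dynamic state is discarded."""
        self.controller_reachable = False
        self.dynamic = {}

    def lookup(self, destination: str) -> Optional[Route]:
        """Return the route to use, or None if the packet would be dropped."""
        if self.controller_reachable and destination in self.dynamic:
            return self.dynamic[destination]
        if not self.handles_static_correctly:
            return None
        return self.static.get(destination)


PREFIX = "10.0.0.0/24"  # hypothetical destination
static = [Route(PREFIX, "backup-hop", "static")]
healthy = Device(static)
faulty = Device(static, handles_static_correctly=False)

for device in (healthy, faulty):
    device.program([Route(PREFIX, "primary-hop", "dynamic")])
    device.lose_controller()

healthy_result = healthy.lookup(PREFIX)
faulty_result = faulty.lookup(PREFIX)
print(healthy_result)  # the static backup route
print(faulty_result)   # None
```

In this toy model the healthy device keeps forwarding over its static route after the controller is lost, while the faulty one drops traffic, which mirrors the report's statement that the failover itself worked and only a single component caused loss.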

Remediation and Prevention

On Tuesday, 25 April 2023 at 07:03 a.m. US/Pacific, internal monitoring alerted Google engineers to an issue impacting the network controller, with intermittent packet loss, and they immediately started an investigation. Once the nature and scope of the issue became clear, the operations team was notified and power was restored to the racks.

Google is committed to preventing a repeat of this issue in the future and is completing the following actions:

  • Audit all rack labels and resolve any discrepancies
  • Train the technicians who are responsible for decommissioning on how to prevent recurrence
  • Implement additional defense mechanisms against traffic impact during cold restart of the network control plane

Google is committed to quickly and continually improving our technology and operations to prevent service disruptions. We appreciate your patience and apologize again for the impact to your organization. We thank you for your business.

Detailed Description of Impact

On Tuesday, 25 April 2023 from 07:03 to 09:42 a.m. US/Pacific, there were several independent packet loss events:

  • Packet loss was observed between 07:03 and 09:42 a.m., when one component of a single networking device handled the backup static routes incorrectly, causing a low level of packet loss for traffic between us-east4-a and the rest of the network.
  • Google Cloud customers were affected by intra-fabric packet loss during network controller recovery between 08:44 and 08:48 a.m.

26 Apr 2023 13:26 PDT

Mini Incident Report

We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support.

(All Times US/Pacific)

Incident Start: 25 April 2023 at 07:03

Incident End: 25 April 2023 at 09:42

Duration: 2 hours, 39 minutes

Affected Services and Features:

Google Cloud Networking - Cloud VPN, Cloud Load Balancer, VPC, Persistent Disk

Regions/Zones: us-east4-a

Description:

Google Cloud Networking experienced connectivity issues with Virtual Machines (VMs) in us-east4-a, and Google Cloud Load Balancer customers experienced low-level network packet loss for a duration of 2 hours, 39 minutes.

From preliminary analysis, the root cause of the event was a power issue, which impacted the network control plane connectivity and resulted in network traffic falling back to static routing. When the control plane connectivity was restored at 08:40 US/Pacific, there was a brief period of high packet loss while the network reconverged.

Google will complete a full Incident Report in the following days that will provide a full root cause.

Customer Impact:

Cloud VPN - Customers were unable to connect to VMs in us-east4-a.

Google Cloud Load Balancer - Customers experienced low-level network packet loss.

VPC - Customers experienced low-level packet loss to us-east4-a over public IPs.

Persistent Disk - Customers experienced read, write, and unmap operations taking 1.5 to 2 minutes on impacted devices. Primary impact was observed from 08:42 to 08:54 US/Pacific.

25 Apr 2023 09:55 PDT

The issue with Cloud Armor, Cloud Load Balancing, and Google Cloud Networking has been resolved for all affected users as of Tuesday, 2023-04-25 09:52 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

25 Apr 2023 08:47 PDT

Summary: Google Cloud Networking traffic loss visible in us-east4

Description: Mitigation work is currently underway by our engineering team.

The mitigation is expected to complete by Tuesday, 2023-04-25 10:00 US/Pacific.

We will provide more information by Tuesday, 2023-04-25 10:15 US/Pacific.

Diagnosis:

  • Cloud VPN customers in us-east4 may be impacted
  • Load Balancer products and Cloud Armor in us-east4 may be impacted
  • Some VMs in us-east4 may not be reachable

Workaround: None at this time

25 Apr 2023 08:21 PDT

Summary: Google Cloud Networking traffic loss visible in us-east4

Description: We are experiencing an issue with Google Cloud Networking beginning on Tuesday, 2023-04-25 07:03 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Tuesday, 2023-04-25 09:00 US/Pacific with current details.

Diagnosis:

  • Cloud VPN customers in us-east4 may be impacted
  • Load Balancer products and Cloud Armor in us-east4 may be impacted
  • Some VMs in us-east4 may not be reachable

Workaround: None at this time

25 Apr 2023 08:10 PDT

Summary: Google Cloud Networking traffic loss visible in us-east4

Description: We are experiencing an issue with Google Cloud Networking beginning on Tuesday, 2023-04-25 07:03 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Tuesday, 2023-04-25 09:00 US/Pacific with current details.

Diagnosis:

  • Cloud VPN customers in us-east4 may be impacted
  • Load Balancer products and Cloud Armor in us-east4 may be impacted

Workaround: None at this time

25 Apr 2023 07:58 PDT

Summary: Google Cloud Networking traffic loss visible in us-east4

Description: We are experiencing an issue with Google Cloud Networking beginning on Tuesday, 2023-04-25 07:03 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Tuesday, 2023-04-25 08:30 US/Pacific with current details.

Diagnosis: None at this time

Workaround: None at this time