Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google Compute Engine

Issue with Network Connectivity on April 10th, 2015

Incident began at 2015-04-10 02:10 and ended at 2015-04-10 02:24 (all times are US/Pacific).

Date Time Description
13 Apr 2015 04:37 PDT

SUMMARY:

On Friday 10 April 2015, Google Compute Engine instances in us-central1 experienced elevated packet loss for a duration of 14 minutes. If your service or application was affected, we apologize — this is not the level of quality and reliability we strive to offer you, and we are taking immediate steps to improve the platform’s performance and availability.

DETAILED DESCRIPTION OF IMPACT:

On Friday 10 April 2015 from 02:10 to 02:24 PDT, instances hosted in Google Compute Engine zone us-central1-b experienced elevated packet loss for internal (VM <-> VM) traffic, and every zone in region us-central1 experienced elevated packet loss for external (Internet <-> VM) traffic. The impact varied by network path: for VM to VM and VM to Internet traffic, reported packet loss was between 26% and 47% at peak, while for Internet to VM traffic, between 18% and 34% of total packets were lost.
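For readers interpreting the figures above, a packet loss percentage is the share of sent packets that never arrived. The following is a minimal Python sketch of that arithmetic. The function name and the counter values are hypothetical and chosen only for illustration; the report does not publish raw packet counts or describe how the loss was measured.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Return the percentage of sent packets that were not received."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


# Hypothetical counters (not from the report), used only to show the arithmetic.
loss = packet_loss_percent(sent=1000, received=740)
print(f"{loss:.1f}% packet loss")
```

In practice the same calculation would be applied per network path (VM to VM, VM to Internet, Internet to VM), which is why the report quotes a separate range for each.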

ROOT CAUSE:

During routine planned maintenance, a miscommunication resulted in traffic being sent to a datacenter router that was running a test configuration. This saturated a link, causing packet loss. The faulty configuration became effective at 02:10 and caused traffic congestion soon after.

REMEDIATION AND PREVENTION:

Google Engineers were notified by our alerting systems at 02:12 and confirmed an unusually high rate of packet loss at 02:18. At 02:21 Google Engineers disabled the problematic router, distributing traffic to other, unsaturated links. Normal operation was restored at 02:24.
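The timeline above can be turned into elapsed-time figures (time to alert, time to mitigation, total duration). The sketch below does this in Python using only the timestamps stated in this report; the event labels and the helper name `minutes_since_start` are illustrative and not part of any Google tooling.

```python
from datetime import datetime

# Timestamps as stated in the report (all US/Pacific, 10 April 2015).
DAY = "2015-04-10"
events = {
    "faulty configuration effective": "02:10",
    "alert fired": "02:12",
    "packet loss confirmed": "02:18",
    "router disabled": "02:21",
    "normal operation restored": "02:24",
}


def minutes_since_start(events: dict) -> dict:
    """Map each event to whole minutes elapsed since the earliest event."""
    parsed = {
        name: datetime.strptime(f"{DAY} {clock}", "%Y-%m-%d %H:%M")
        for name, clock in events.items()
    }
    start = min(parsed.values())
    return {
        name: int((ts - start).total_seconds() // 60)
        for name, ts in parsed.items()
    }


elapsed = minutes_since_start(events)
for name, minutes in elapsed.items():
    print(f"+{minutes:2d} min  {name}")
```

Read this way, the alert fired 2 minutes after the faulty configuration took effect, the router was disabled at 11 minutes, and normal operation returned at 14 minutes, matching the incident duration given in the summary.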

To prevent similar incidents in the future, we are changing our procedures to include additional validation checks when configuring routers during maintenance activities. We are also implementing a higher degree of automation to remove potential human and communication errors when changing router configurations.

10 Apr 2015 02:56 PDT

The problem with Google Compute Engine network connectivity was resolved as of 02:24 US/Pacific on 10th April 2015. We apologize for any issue this may have caused you or your users and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google and we are constantly working to improve the reliability of our systems.

10 Apr 2015 02:46 PDT

We are currently investigating an issue with Google Compute Engine network connectivity. We will provide an update by Friday 10th April 2015 03:30 PDT.