Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google App Engine

We're investigating an issue with Google App Engine URL Fetch service beginning Tue, 2014-07-01 20:00 (all times are in US/Pacific). We will provide more information shortly.

Incident began at 2014-07-01 18:00 and ended at 2014-07-03 03:20 (all times are US/Pacific).

Date Time Description
9 Jul 2014 08:27 PDT

SUMMARY: Between Tuesday 1 July and Thursday 3 July 2014, some Google App Engine applications experienced elevated errors from the URL Fetch service for a duration of 32 hours and 55 minutes. We apologize if your application was affected. We have already taken steps to improve the URL Fetch service’s reliability.

DETAILED DESCRIPTION OF IMPACT: Some App Engine applications hosted in US datacenters that make significant use of URL Fetch experienced elevated errors when calling the URL Fetch service between Tuesday 1 July 2014 18:00 and Thursday 3 July 2014 02:55 US/Pacific. Of the applications that call URL Fetch more than once a second, 1% experienced an error rate increase of 4% or greater, and another 49% experienced an error rate increase of between 0.5% and 4%. URL Fetch calls to Google APIs (except Cloud Storage) and appspot.com URLs were not affected.
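The stated duration of 32 hours and 55 minutes follows from the impact window given above. As a quick check, the arithmetic can be sketched as follows. The function name `format_duration` is illustrative only; both timestamps are US/Pacific, so plain (naive) datetimes are assumed to be sufficient.

```python
from datetime import datetime


def format_duration(start: datetime, end: datetime) -> str:
    """Return the elapsed time between two timestamps as 'H hours and M minutes'."""
    total_minutes = int((end - start).total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} hours and {minutes} minutes"


# Impact window from the report; both values are in the same time zone.
impact_start = datetime(2014, 7, 1, 18, 0)
impact_end = datetime(2014, 7, 3, 2, 55)

print(format_duration(impact_start, impact_end))  # 32 hours and 55 minutes
```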

ROOT CAUSE: The incident was caused by an issue with the system that creates outbound connections from Google servers. In one US datacenter, this system had an elevated error rate which was triggered by high load.

REMEDIATION AND PREVENTION: App Engine uses systems in multiple datacenters to create outbound connections. A load balancer determines how many outbound connections are handled by each datacenter. To remediate the issue, we redirected load to other datacenters.

To prevent recurrence, we have increased the capacity of the system in the affected datacenter while keeping the same cap on the amount of traffic that the load balancer sends to it. We will make the same change in all other datacenters. We will also make improvements to the load balancer configuration to reduce the risk of hot spots.
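The effect of raising capacity while holding the traffic cap constant can be illustrated with a small calculation. This is only a sketch: the numbers below are hypothetical, not the real capacity or cap values, and `utilization` is an illustrative helper rather than part of any Google system.

```python
def utilization(traffic_cap: float, capacity: float) -> float:
    """Fraction of a datacenter's capacity used when traffic reaches the cap."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return traffic_cap / capacity


# Hypothetical figures, in connections per second.
TRAFFIC_CAP = 800.0  # unchanged limit enforced by the load balancer

before = utilization(TRAFFIC_CAP, capacity=1000.0)  # capacity before the change
after = utilization(TRAFFIC_CAP, capacity=1600.0)   # capacity after the change

print(f"before: {before:.0%}, after: {after:.0%}")  # before: 80%, after: 50%
```

With the cap unchanged, the added capacity becomes headroom, so the same peak traffic leaves the system further from the high-load condition that triggered the errors.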

The network connection system currently has black box monitoring that detects failures in connections to servers outside Google. However, this monitoring did not detect this incident because it sends too few connection requests to reveal a small increase in error rate. We will send a higher volume of connection requests to ensure that we can detect any similar issues.
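Why probe volume matters can be shown with a simplified model. The sketch below assumes that probes fail independently at a fixed error rate and that a single failed probe is enough to raise a signal. Neither assumption comes from the report, and the error rate and probe counts are hypothetical.

```python
def detection_probability(error_rate: float, probes: int) -> float:
    """Probability that at least one of `probes` independent probes fails."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if probes < 0:
        raise ValueError("probes must be non-negative")
    return 1.0 - (1.0 - error_rate) ** probes


# Hypothetical values: a 0.5% error rate observed with few or many probes.
low_volume = detection_probability(0.005, probes=10)
high_volume = detection_probability(0.005, probes=1000)

print(f"10 probes:   {low_volume:.3f}")
print(f"1000 probes: {high_volume:.3f}")
```

Under these assumptions, ten probes see a failure only about 5% of the time, while a thousand probes see one more than 99% of the time.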

Monitoring of the overall App Engine URL Fetch service uses aggregate metrics in which a small increase in overall error rate can look similar to noise. We will add monitoring that measures the latency and error rate for each application so that we can more easily detect incidents in which a small fraction of applications are significantly affected.
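The difference between aggregate and per-application metrics can be sketched as follows. The record format, the application names, and the 4% threshold used for flagging are assumptions made for illustration; the sketch covers error rate only, not latency, and does not describe the actual monitoring system.

```python
from collections import defaultdict


def error_rates(records):
    """Compute (overall_rate, per_app_rates) from (app_id, succeeded) pairs."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for app_id, succeeded in records:
        totals[app_id] += 1
        if not succeeded:
            errors[app_id] += 1
    total_calls = sum(totals.values())
    overall = sum(errors.values()) / total_calls if total_calls else 0.0
    per_app = {app: errors[app] / totals[app] for app in totals}
    return overall, per_app


def apps_over_threshold(per_app, threshold):
    """Return the apps whose error rate is at or above the threshold."""
    return sorted(app for app, rate in per_app.items() if rate >= threshold)


# Hypothetical traffic: 99 healthy apps and one app with a 4% error rate.
records = []
for i in range(99):
    records += [(f"app-{i}", True)] * 1000
records += [("app-99", False)] * 40 + [("app-99", True)] * 960

overall, per_app = error_rates(records)
print(f"overall error rate: {overall:.2%}")  # 0.04%
print(apps_over_threshold(per_app, 0.04))    # ['app-99']
```

In this example the aggregate error rate is 0.04%, which could easily pass for noise, while the per-application view shows one application at 4%.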

9 Jul 2014 08:27 PDT

The problem with Google App Engine URL Fetch service was resolved as of Thu, 2014-07-03 03:20 (all times are in US/Pacific). We apologize for the inconvenience and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are making continuous improvements to make our systems better.

9 Jul 2014 08:26 PDT

We are still experiencing a slight increase in error rate for Google App Engine URL Fetch service. For everyone who is affected, we apologize for any inconvenience you may be experiencing. We are seeing error rate and latency improvements for many apps. We will provide another update by Thu, 2014-07-01 06:45 (all times are in US/Pacific).

9 Jul 2014 08:25 PDT

We are currently experiencing a slight increase in error rate for Google App Engine URL Fetch service. For everyone who is affected, we apologize for any inconvenience you may be experiencing. We will provide another update by Thu, 2014-07-01 04:45 (all times are in US/Pacific).

9 Jul 2014 08:24 PDT

We are currently experiencing a slight increase in error rate for Google App Engine URL Fetch service. For everyone who is affected, we apologize for any inconvenience you may be experiencing. We will provide an update by Thu, 2014-07-01 02:45 (all times are in US/Pacific) with current details.

9 Jul 2014 08:23 PDT

We're investigating an issue with Google App Engine URL Fetch service beginning Tue, 2014-07-01 20:00 (all times are in US/Pacific). We will provide more information shortly.