Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google App Engine

Task queue delays in dispatching tasks. Files API errors creating files.

Incident began at 2014-09-29 19:30 and ended at 2014-09-30 09:00 (all times are US/Pacific).

Date Time Description
1 Oct 2014 13:55 PDT

SUMMARY:

On Monday 29 September 2014, some Google App Engine applications using the Task Queue API experienced a decrease in the dispatch rate for tasks for a period of 2 hours and 28 minutes. In addition, on Monday 29 September and Tuesday 30 September 2014, some App Engine applications experienced errors when creating files using the Files API for a period of 11 hours and 2 minutes.

We hold ourselves to a high standard, and we failed to meet that standard. We are taking action to ensure that incidents like this do not happen in the future.

DETAILED DESCRIPTION OF IMPACT:

From Monday 29 September 2014 19:30 to 21:58 PDT, 29% of App Engine applications using the Task Queue API in US datacenters experienced a decrease in the dispatch rate for tasks. During the incident, tasks were dispatched at 78% of the rate seen during the previous day at the same time.

From Monday 29 September 21:58 until Tuesday 30 September 09:00, 27% of App Engine applications using the Files API in US datacenters experienced errors when creating files. The error rate for affected applications during this period was 95%.
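The two impact windows above can be checked against the durations quoted in the summary. The following is a minimal sketch, not part of the original report: the helper names `window_minutes` and `as_hours_minutes` are hypothetical, and the timestamps are treated as naive local times, which assumes both endpoints share the same US/Pacific offset.

```python
from datetime import datetime

# Timestamp layout used for the incident windows (assumed for this sketch).
FMT = "%Y-%m-%d %H:%M"


def window_minutes(start, end):
    """Return the whole minutes elapsed between two local timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


def as_hours_minutes(total_minutes):
    """Split a minute count into (hours, minutes)."""
    return divmod(total_minutes, 60)


# Task Queue dispatch slowdown: 19:30 to 21:58 on 29 September.
task_queue = window_minutes("2014-09-29 19:30", "2014-09-29 21:58")

# Files API errors: 21:58 on 29 September to 09:00 on 30 September.
files_api = window_minutes("2014-09-29 21:58", "2014-09-30 09:00")

print(as_hours_minutes(task_queue))  # (2, 28)
print(as_hours_minutes(files_api))   # (11, 2)
```

These results match the 2 hours and 28 minutes and the 11 hours and 2 minutes stated in the summary.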

ROOT CAUSE:

Both the task queue dispatch issue and Files API issue were ultimately caused by a failure in the storage layer in one US datacenter. Initially, the impact of the storage layer issue was limited to a drop in the task queue dispatch rate. We later determined that its impact would become more severe. We therefore redirected all App Engine traffic to other datacenters. This change exposed a latent misconfiguration in the Files API, which caused affected applications to experience errors when creating files.

REMEDIATION AND PREVENTION:

The App Engine support team received the first customer report of a drop in the task queue dispatch rate at 20:31. To resolve this issue, our engineers moved task queue operations for affected applications to other datacenters at 21:58.

At 22:54 on Monday 29 September, our engineers moved all App Engine traffic away from the affected datacenter, which led to the Files API errors. Our engineers diagnosed and fixed the Files API issue at 07:41 on Tuesday 30 September. The fix was fully rolled out to all affected customers by 09:00.

The Files API is now deprecated (http://googlecloudplatform.blogspot.com/2013/06/google-app-engine-181-released.html). We recommend that customers who still use it migrate their code to the Cloud Storage client library instead:

https://cloud.google.com/appengine/docs/java/googlecloudstorageclient/
https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/
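As a rough illustration of the migration, the sketch below writes a small text file through a `cloudstorage`-style `open()` call. It is an assumption-laden example rather than official guidance: the helper `write_text_file`, the bucket name, and the object name are hypothetical, and the opener is passed in as a parameter so the helper does not depend on the App Engine runtime. It assumes the opener accepts a `/bucket/object` path, a mode, and a `content_type` keyword, as the Python client library's `open()` is documented to do.

```python
from contextlib import closing


def write_text_file(gcs_open, bucket, object_name, text):
    """Write `text` as UTF-8 to /bucket/object_name.

    `gcs_open` is assumed to behave like the Cloud Storage client
    library's open(): it takes a "/bucket/object" path and a mode,
    accepts a content_type keyword, and returns a file-like object
    with write() and close().
    """
    path = "/%s/%s" % (bucket, object_name)
    # closing() guarantees close() is called, which finalizes the write.
    with closing(gcs_open(path, "w", content_type="text/plain")) as handle:
        handle.write(text.encode("utf-8"))
    return path


# Hypothetical usage inside an App Engine application:
#   import cloudstorage
#   write_text_file(cloudstorage.open, "my-bucket", "report.txt", u"hello")
```

Passing the opener in keeps the helper easy to exercise with a stand-in object. Consult the client library documentation linked above for the supported options.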

Our support team will contact customers who make significant use of the Files API and help them move their code to a fully supported solution.

29 Sep 2014 23:04 PDT

The reduced processing rate in Google App Engine Task Queue was fully resolved as of Monday, 2014-09-29 22:13 US/Pacific. We apologize for the inconvenience and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are making continuous improvements to make our systems better.

29 Sep 2014 23:02 PDT

We're investigating an issue with Google App Engine Task Queue beginning at Monday, 2014-09-29 19:30 US/Pacific. We will provide more information within the next 30 minutes.