Google Cloud Status

This page provides status information on the services that are part of the Google Cloud Platform. Check back here to view the current status of the services listed below. For additional information on these services, please visit cloud.google.com.

Google App Engine Incident #15019

Task queues not able to execute

Incident began at 2015-06-16 20:10 and ended at 2015-06-16 23:35 (all times are US/Pacific).

Date Time Description
Jun 18, 2015 19:57

SUMMARY:

On Tuesday, 16 June 2015, the Google App Engine Task Queue service and App Engine application deployment experienced increased error rates for a duration of 3 hours and 25 minutes. If your service or application was affected, we apologize. We have taken action to fix the issue and are in the process of making the system more reliable.

DETAILED DESCRIPTION OF IMPACT:

On Tuesday, 16 June 2015 from 20:10 to 23:35 PDT, some developers of Google App Engine applications in the US region were unable to deploy their applications. The overall error rate of deployments during this period was approximately 60%. Affected developers saw attempted deployments exit and report an internal server error after HTTP requests to appengine.google.com timed out. The App Engine Admin Console was unable to load data for affected applications. Additionally, between 20:58 and 21:33, applications in the US region experienced an increase in error rate of up to 0.25% as well as slower execution of Task Queue tasks.

ROOT CAUSE:

Google engineers performed maintenance on a storage system in one of the datacenters that App Engine uses. During this maintenance, components of App Engine that rely on this storage system had to switch to a replica in a different datacenter. For both deployments and Task Queues, this switch did not function properly.
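As a rough illustration of the kind of switch described above, the following Python sketch reads from a primary storage backend and falls back to a replica in another datacenter when the primary is unavailable. It is only a sketch under stated assumptions: `Backend`, `StorageUnavailable`, `read_with_failover`, and the datacenter names are hypothetical and do not reflect App Engine internals.

```python
class StorageUnavailable(Exception):
    """Raised by a backend that cannot serve requests (e.g. under maintenance)."""


class Backend:
    """Hypothetical in-memory stand-in for a storage system in one datacenter."""

    def __init__(self, name, data, available=True):
        self.name = name
        self.data = data
        self.available = available

    def read(self, key):
        if not self.available:
            raise StorageUnavailable(self.name)
        return self.data[key]


def read_with_failover(key, backends):
    """Try each backend in order; return (value, backend name) from the first that answers."""
    errors = []
    for backend in backends:
        try:
            return backend.read(key), backend.name
        except StorageUnavailable as exc:
            errors.append(str(exc))
    raise StorageUnavailable("all backends failed: " + ", ".join(errors))


# Assumed scenario: the primary is under maintenance, the replica is healthy.
primary = Backend("dc-a", {"queue-state": "ok"}, available=False)
replica = Backend("dc-b", {"queue-state": "ok"})
value, served_by = read_with_failover("queue-state", [primary, replica])
print(value, served_by)
```

In this sketch the caller never needs to know which datacenter answered; the failure mode in the incident corresponds to the fallback path itself not working for some components.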

REMEDIATION AND PREVENTION:

At 21:33, Google engineers took measures to prevent the Task Queue service from accessing the storage system under maintenance. In addition, all traffic for the affected applications was redirected to alternate datacenters at 23:26. The redirection was completed by 23:35, and applications were again able to deploy successfully.

To prevent the issue from recurring, we are working to make deployments and Task Queue more resilient to movements in the underlying storage system, in a similar fashion to other App Engine components.

Jun 17, 2015 00:20

The issue with application deployment was resolved as of Wednesday, 2015-06-17 00:00. Again, we apologize for the inconvenience and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are making continuous improvements to make our systems better. We will provide a more detailed analysis of this incident once we have completed our internal investigation.

Jun 16, 2015 23:51

We are continuing to investigate the issue with application deployment and will provide a further update by Wednesday, 2015-06-17 00:20.

Jun 16, 2015 23:23

The issue with application deployments is ongoing; symptoms of a deployment failure are posted below. We are continuing to investigate and will post a further update in 30 minutes.

--

Error posting to URL:https://appengine.google.com/api/appversion/precompile?module=default&app_id=[APPID]&version=[VERSION_ID]

500 Internal Server Error

500 Server Error

Error: Server Error

The server encountered an error and could not complete your request.

Please try again in 30 seconds.

This is try #0

[TIMESTAMP] com.google.appengine.tools.admin.AbstractServerConnection send1

Error posting to URL:https://appengine.google.com/api/appversion/precompile?module=default&app_id=[APPID]&version=[VERSION_ID]

503 Service Unavailable

Try Again (503)An unexpected failure has occurred. Please try again.
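The log above shows the deployment tool receiving a 500, being told to try again in 30 seconds, and then receiving a 503. As a hedged illustration, a client-side script might wrap such a call in a bounded retry loop with exponential backoff. In this Python sketch, `deploy_once` is a hypothetical callable that returns an HTTP status code; it is not part of the App Engine SDK, and the 30-second base delay is taken only from the message in the log.

```python
import time

# Status codes treated as transient in this sketch (both appear in the log above).
RETRYABLE = {500, 503}


def deploy_with_retry(deploy_once, max_tries=4, base_delay=30.0, sleep=time.sleep):
    """Call deploy_once() until it returns a non-retryable status or tries run out.

    Returns a tuple (last_status, number_of_tries).
    """
    status = None
    for attempt in range(max_tries):
        status = deploy_once()
        if status not in RETRYABLE:
            return status, attempt + 1
        if attempt < max_tries - 1:
            # Double the wait after each failed try: 30 s, 60 s, 120 s, ...
            sleep(base_delay * (2 ** attempt))
    return status, max_tries


# Simulated run: two transient failures, then success. Delays are recorded
# instead of actually sleeping.
statuses = iter([500, 503, 200])
delays = []
result = deploy_with_retry(lambda: next(statuses), sleep=delays.append)
print(result, delays)
```

Capping the number of tries matters here: during an outage lasting hours, as in this incident, an unbounded retry loop would only add load without succeeding.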

Jun 16, 2015 22:52

We are continuing to investigate the issue with application deployment and will provide a further update by Tuesday, 2015-06-16 23:20.

Jun 16, 2015 22:20

The problem with Google App Engine Task Queue was resolved as of Tuesday, 2015-06-16 21:35 (all times are in US/Pacific); however, some users may continue to experience difficulties with application deployment. We are continuing to investigate this and will provide a further update with current details by Tuesday, 2015-06-16 22:50. Currently, this service disruption is affecting less than 8% of users.

We apologize for the inconvenience and thank you for your patience and continued support.

Jun 16, 2015 21:51

We're investigating an issue with Google App Engine task queues that began on Tuesday, 2015-06-16 at 20:00 (all times are in US/Pacific). Users may also experience issues with application deployment. We will provide more information within 30 minutes.
