Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google BigQuery

Execution of BigQuery query jobs is delayed, queries may take longer than usual to complete

Incident began at 2014-10-13 00:30 and ended at 2014-10-13 13:55 (all times are US/Pacific).

Date Time Description
24 Oct 2014 11:07 PDT

SUMMARY: On Monday 13 October 2014, some user-submitted BigQuery jobs experienced increased execution time for a period of 13 hours and 18 minutes. If your jobs were affected by this delayed execution, we apologize; we strive to maintain the highest standard of performance and reliability and failed to uphold that standard in this instance. We have implemented changes to address this issue, and to monitor for and prevent future recurrences.

DETAILED DESCRIPTION OF IMPACT: From 00:37 to 13:55 PDT on Monday 13 October 2014, 1.6% of queries experienced scheduling delays of up to four hours and experienced performance degradation during execution. Affected jobs started to recover by 07:17 and were fully recovered by 13:55.

ROOT CAUSE: The BigQuery service received a combination of user queries that led to lock contention in the underlying component responsible for processing large joins and groupings. This lock contention slowed down both scheduling and query execution.

REMEDIATION AND PREVENTION: Monitoring systems alerted Google engineers to increased query latency at 02:27. To address the performance of the service, Google engineers restarted several of the service components, and focused on identifying specific affected queries and projects. At 07:27, the engineers redirected traffic to an unaffected datacenter to mitigate the effect on new incoming queries.

To prevent further recurrence of this issue, Google engineers have addressed the sources of lock contention responsible for performance degradation, and have added further instrumentation to help on-call engineers quickly identify problematic combinations of user queries. Finally, Google engineers have added more stringent detection and alerting in cases of latency increases.
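
The intervals implied by the timestamps in this report can be checked with a short calculation. The following is a minimal sketch, not part of the original report: it assumes all quoted times fall on 2014-10-13 US/Pacific, and the helper names (at, fmt) and interval labels are illustrative only.

```python
from datetime import datetime, timedelta

# All times are quoted in the report as US/Pacific on the same calendar day,
# so naive datetimes are sufficient for computing differences.
DAY = "2014-10-13"


def at(hhmm: str) -> datetime:
    """Parse an HH:MM time on the incident day (hypothetical helper)."""
    return datetime.strptime(f"{DAY} {hhmm}", "%Y-%m-%d %H:%M")


def fmt(delta: timedelta) -> str:
    """Format a timedelta as 'Hh MMm' (hypothetical helper)."""
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}h {minutes % 60:02d}m"


impact_start = at("00:37")  # start of impact (detailed description)
alerted = at("02:27")       # monitoring alerted engineers
redirected = at("07:27")    # traffic redirected to an unaffected datacenter
resolved = at("13:55")      # affected jobs fully recovered

intervals = {
    "total impact": resolved - impact_start,
    "start of impact to alert": alerted - impact_start,
    "alert to traffic redirect": redirected - alerted,
    "traffic redirect to full recovery": resolved - redirected,
}

for label, delta in intervals.items():
    print(f"{label}: {fmt(delta)}")
```

The total impact computed this way is 13h 18m, which matches the duration stated in the summary.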

13 Oct 2014 14:14 PDT

The problem with Google BigQuery should be resolved as of Monday, 2014-10-13 13:55 US/Pacific. We apologize for any issues this may have caused to you or your users and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are constantly working to improve the reliability of our systems. We will provide a more detailed analysis of this incident once we have completed our internal investigation.

13 Oct 2014 12:35 PDT

Google BigQuery performance has been restored for most query jobs, and we expect resolution for the remaining affected jobs in the near future. For everyone who is affected, we apologize for any inconvenience you are experiencing. We will provide the next update by 2014-10-13 13:00 (Pacific Time) with further details.

13 Oct 2014 11:24 PDT

We are continuing work to correct the ongoing issues with Google BigQuery. We have stabilized system performance and are closely monitoring the situation as we continue to investigate the root cause. For everyone who is affected, we apologize for any inconvenience you may be experiencing. We will provide another status update by 2014-10-13 12:00 US/Pacific.

13 Oct 2014 10:12 PDT

We are still investigating the issue with the Google BigQuery service. Execution of some query jobs is taking longer than usual and may time out. We will provide another status update by 2014-10-13 11:00 US/Pacific.

13 Oct 2014 09:05 PDT

We're investigating an issue with the BigQuery service beginning at around 2014-10-13 00:30 US/Pacific. Execution of some query jobs is taking longer than usual and might time out. We will provide more information by 2014-10-13 10:00 US/Pacific.