This service is currently experimental and data is provided for test purposes only.
This page provides status information on the services that are part of the Google Cloud Platform. Check back here to view the current status of the services listed below. For additional information on these services, please visit cloud.google.com.
Google BigQuery Incident #18003
Execution of BigQuery query jobs is delayed, queries may take longer than usual to complete
Incident began at 10/13/2014 00:30 and ended at 10/13/2014 13:55.
| Date | Time | Description |
|---|---|---|
| Oct 24, 2014 | 11:07 | SUMMARY: On Monday 13 October 2014, some user-submitted BigQuery jobs experienced increased execution time for a period of 13 hours and 18 minutes. If your jobs were affected by this delayed execution, we apologize; we strive to maintain the highest standard of performance and reliability and failed to uphold that standard in this instance. We have implemented changes both to address this issue and to monitor for and prevent future recurrences. DETAILED DESCRIPTION OF IMPACT: From 00:37 to 13:55 PDT on Monday 13 October 2014, 1.6% of queries experienced scheduling delays of up to four hours and performance degradation during execution. Affected jobs started to recover by 07:17 and were fully recovered by 13:55. ROOT CAUSE: The BigQuery service received a combination of user queries that led to lock contention in the underlying component responsible for processing large joins and groupings. This lock contention slowed down both scheduling and query execution. REMEDIATION AND PREVENTION: Monitoring systems alerted Google engineers to increased query latency at 02:27. To address the performance of the service, Google engineers restarted several of the service components and focused on identifying specific affected queries and projects. At 07:27, the engineers redirected traffic to an unaffected datacenter to mitigate the effect on new incoming queries. To prevent recurrence of this issue, Google engineers have addressed the sources of lock contention responsible for the performance degradation and have added further instrumentation to help on-call engineers quickly identify problematic combinations of user queries. Finally, Google engineers have added more stringent detection and alerting for latency increases. |
| Oct 13, 2014 | 14:00 | The problem with Google BigQuery should be resolved as of Monday, 2014-10-13 13:55 US/Pacific. We apologize for any issues this may have caused to you or your users and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are constantly working to improve the reliability of our systems. We will provide a more detailed analysis of this incident once we have completed our internal investigation. |
| Oct 13, 2014 | 12:35 | Google BigQuery performance has been restored for most query jobs, and we expect resolution for the remaining affected jobs in the near future. For everyone who is affected, we apologize for any inconvenience you are experiencing. We will provide the next update by 2014-10-13 13:00 (Pacific Time) with further details. |
| Oct 13, 2014 | 11:00 | We are continuing work to correct the ongoing issues with Google BigQuery. We have stabilized system performance and are closely monitoring the situation as we continue to investigate the root cause. For everyone who is affected, we apologize for any inconvenience you may be experiencing. We will provide another status update by 2014-10-13 12:00 US/Pacific. |
| Oct 13, 2014 | 10:12 | We are still investigating the issue with the Google BigQuery service. Execution of some query jobs is taking longer than usual and may time out. We will provide another status update by 2014-10-13 11:00 US/Pacific. |
| Oct 13, 2014 | 09:05 | We're investigating an issue with the BigQuery service that began at around 2014-10-13 00:30 US/Pacific. Execution of some query jobs is taking longer than usual and might time out. We will provide more information by 2014-10-13 10:00 US/Pacific. |
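
As a cross-check of the timeline in the 24 October summary, the sketch below recomputes the intervals between the reported timestamps. It is an illustration only and was not part of the original incident report. The timestamps are copied from the report; the variable names and the `hours_minutes` helper are hypothetical, and "time to alert" and "time to redirect" are our labels for the gaps between the start of impact and the 02:27 alert and 07:27 traffic redirect.

```python
from datetime import datetime

# Timestamps copied from the incident report (all PDT, 13 October 2014).
# Variable names are illustrative, not terms used by the report.
FMT = "%Y-%m-%d %H:%M"
impact_start = datetime.strptime("2014-10-13 00:37", FMT)
alert_fired = datetime.strptime("2014-10-13 02:27", FMT)
traffic_redirected = datetime.strptime("2014-10-13 07:27", FMT)
impact_end = datetime.strptime("2014-10-13 13:55", FMT)


def hours_minutes(delta):
    """Return a timedelta as an (hours, minutes) tuple."""
    total_minutes = int(delta.total_seconds()) // 60
    return divmod(total_minutes, 60)


impact_duration = hours_minutes(impact_end - impact_start)
time_to_alert = hours_minutes(alert_fired - impact_start)
time_to_redirect = hours_minutes(traffic_redirected - impact_start)

print(impact_duration)   # (13, 18)
print(time_to_alert)     # (1, 50)
print(time_to_redirect)  # (6, 50)
```

The first result matches the 13 hours and 18 minutes stated in the summary. The other two intervals are derived from the reported times and are not figures the report itself gives.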
