Google Cloud Status Dashboard

This page provides status information on the services that are part of Google Cloud Platform. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit cloud.google.com.

Google BigQuery Incident #18014

BigQuery streaming inserts delayed in EU

Incident began at 2016-04-05 18:00 and ended at 2016-04-06 12:00 (all times are US/Pacific).

Date Time Description
Apr 11, 2016 06:29

SUMMARY

On Tuesday 5 April and Wednesday 6 April 2016, some streaming inserts to BigQuery datasets in the EU were delayed by up to 16 hours and 46 minutes. We sincerely apologise for these delays and we are addressing the root causes of the issue as part of our commitment to BigQuery's availability and responsiveness.

DETAILED DESCRIPTION OF IMPACT

From 15:16 PDT to 23:40 on Tuesday 05 April 2016, some BigQuery streaming inserts to datasets in the EU did not immediately become available to subsequent queries. From 23:40, new streaming inserts worked normally, but some previously delayed inserts remained unavailable to BigQuery queries. Virtually all delayed inserts were committed and available by 07:52 on Wednesday 06 April.

The event was accompanied by slightly elevated error rates (< 0.7% failure rate) and latency (< 50% latency increase) of API calls for streaming inserts.

ROOT CAUSE

BigQuery streaming inserts are buffered in one of Google's large-scale storage systems before being committed to the main BigQuery repository. At 15:16 PDT on Tuesday 05 April, this storage system began to experience issues in one of the datacenters that host BigQuery datasets in the EU, blocking BigQuery's I/O operations for streaming inserts. The impact reached monitoring threshold levels after a few hours, and at 18:29 automated monitoring systems sent alerts to the Google engineering team. However, the monitoring systems displayed the alerts in a way that disguised the scale of the issue and made it seem to be a low priority. This error was identified at 23:01, and Google engineers began routing all European streaming insert traffic to another EU datacenter, restoring normal insert behaviour by 23:40. The delayed inserts in the system were committed when the underlying storage system was restored to service.

REMEDIATION AND PREVENTION

Google engineers are addressing the technical root cause of the incident by increasing the fault-tolerance of I/O between BigQuery and the storage system that buffers streaming inserts.
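
As an illustration only, one common way to make a buffered write more fault-tolerant is to try an alternate buffer location when the primary one cannot accept writes. The sketch below is not Google's implementation; `BufferUnavailable`, `write_with_failover`, and the backend callables are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, Sequence


class BufferUnavailable(Exception):
    """Hypothetical error raised when a buffer location cannot accept writes."""


def write_with_failover(
    row: Dict[str, object],
    backends: Sequence[Callable[[Dict[str, object]], None]],
) -> int:
    """Try each buffer backend in order; return the index of the one that accepted the row."""
    for index, write in enumerate(backends):
        try:
            write(row)
            return index
        except BufferUnavailable:
            # This location is blocked; fall through to the next one.
            continue
    raise BufferUnavailable("no buffer location accepted the row")
```

A caller would pass the primary location first and one or more alternates after it. A returned index above zero means the write was rerouted.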

The principal remediation efforts for this event, however, are focused on the systems monitoring, alert escalation, and data visualisation issues which were involved. Google engineers are updating the BigQuery monitoring systems to more clearly represent the scale of system behaviour, and modifying internal procedures and documentation accordingly.
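
To illustrate what representing the scale of an alert could look like, the sketch below derives a priority from both the share of traffic affected and how long the condition has lasted, so that a small but persistent problem does not stay at low priority. The function name, the thresholds, and the priority labels are assumptions made for this example, not details from the incident.

```python
from datetime import timedelta


def alert_priority(affected_fraction: float, duration: timedelta) -> str:
    """Map impact (fraction of requests affected, 0.0 to 1.0) and duration to a priority label.

    All thresholds below are illustrative assumptions.
    """
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    if affected_fraction >= 0.10 or duration >= timedelta(hours=2):
        return "page"    # needs immediate attention
    if affected_fraction >= 0.01 or duration >= timedelta(minutes=30):
        return "ticket"  # needs attention, but not immediately
    return "info"
```

With these assumed thresholds, a condition affecting under 1% of requests is still escalated to `"page"` once it has persisted for two hours.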

Apr 06, 2016 12:00

The issue with BigQuery job execution has been fully resolved. Affected customers will be notified directly in order to assess any potential lingering impact. We will also provide a more detailed analysis of this incident once we have completed our internal investigation.

Apr 06, 2016 10:02

Current data indicates that BigQuery streaming inserts are being applied normally. We are still working on restoring the visibility of some streaming inserts to EU datasets from 18:00 to 00:00 US/Pacific. We will provide another status update by 12:00 US/Pacific with current details.

Apr 06, 2016 08:54

Current data indicates that BigQuery streaming inserts are being applied normally. Some streaming inserts to EU datasets from 18:00 to 00:00 US/Pacific are not yet visible in BigQuery and we are working to propagate them. We will provide another status update by 10:00 US/Pacific with current details.

Apr 06, 2016 03:40

Current data indicates that BigQuery streaming inserts are being applied normally. Some streaming inserts to EU datasets from 18:00 to 00:00 US/Pacific are not yet visible in BigQuery and we are working to propagate them. We will provide another status update by 04:30 US/Pacific with current details.

Apr 06, 2016 03:04

We are still investigating the issue with BigQuery job execution. Current data indicates that the issue only affects projects which use streaming inserts to datasets located in the EU. We will provide another status update by 04:00 US/Pacific with current details.

Apr 06, 2016 02:18

We are still investigating the issue with BigQuery job execution. Current data indicates that the issue only affects projects which use streaming inserts to datasets located in the EU. We will provide another status update by 03:00 US/Pacific with current details.

Apr 06, 2016 01:01

We are still investigating the issue with BigQuery job execution. We will provide another status update by 02:00 US/Pacific with current details.

Apr 06, 2016 00:24

We are still investigating the issue with BigQuery job execution. We will provide another status update by 01:00 US/Pacific with current details.

Apr 05, 2016 23:46

We are investigating an issue with BigQuery job execution. We will provide more information by 2016-04-06 00:20 US/Pacific.