Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google Cloud Storage

Google Cloud Storage (GCS) in europe-west1 is experiencing unavailability errors

Incident began at 2022-11-10 00:04 and ended at 2022-11-10 08:07 (all times are US/Pacific).

Previously affected location(s)

Belgium (europe-west1)

Date Time Description
21 Nov 2022 12:37 PST

Incident Report

Summary

Starting on 10 November 2022 at 00:04 PST, customers of Google Cloud Storage (GCS) and Google BigQuery may have seen intermittent error messages while using these services in europe-west1. The disruption lasted 8 hours and 3 minutes.

To our GCS and BigQuery customers who were impacted during this disruption, we sincerely apologize. This is not the level of quality and reliability we strive to offer you. We have conducted an internal investigation and are taking immediate steps to improve the platform’s performance and availability.

Root Cause

This issue was caused by a recent rollout that was intended to improve maintenance, efficiency, and supportability by sharing internal data requests to new jobs. However, due to an issue in the rollout, the migration of the data resulted in transient faults and caused a failure rate of up to 7% of read and 19% of write traffic in europe-west1.

Remediation and Prevention

Google engineers were alerted to this issue and immediately began investigating. They identified the problematic rollout described above; however, the first two attempts to roll back the change did not resolve the issue. Engineers then identified a quicker, more effective direct mitigation, which took a couple of minutes to complete and fully mitigated the impact.

Google is committed to preventing a repeat of this issue in the future and is taking the following actions:

  • We have postponed the internal data migration rollout until all critical preventative actions are complete.
  • Going forward, we will pause binary rollouts and stop retry updates once Google engineers are alerted to paging events.
  • We will implement a two-stage canary deployment for rollouts to reduce the percentage of tasks impacted by a catastrophic error.
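
As a rough illustration of the two-stage canary idea in the last action item, the sketch below gates a rollout on the error rate observed after each stage. The stage fractions, the 1% error threshold, and the names `Stage`, `run_canary_rollout`, `deploy`, `rollback`, and `measure_error_rate` are assumptions made for illustration; they do not describe Google's internal rollout tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Stage:
    name: str
    fraction: float  # share of tasks updated in this stage (hypothetical values)


# Hypothetical plan: two canary stages, then the full rollout.
STAGES: List[Stage] = [
    Stage("canary-1", 0.01),
    Stage("canary-2", 0.10),
    Stage("full", 1.00),
]


def run_canary_rollout(
    deploy: Callable[[float], None],
    rollback: Callable[[], None],
    measure_error_rate: Callable[[], float],
    max_error_rate: float = 0.01,
    stages: List[Stage] = STAGES,
) -> str:
    """Advance stage by stage; roll back and stop at the first unhealthy stage.

    Returns the name of the final stage if every stage passes,
    or "rolled-back" if any stage exceeds max_error_rate.
    """
    for stage in stages:
        deploy(stage.fraction)
        if measure_error_rate() > max_error_rate:
            # Impact is limited to the fraction deployed so far.
            rollback()
            return "rolled-back"
    return stages[-1].name
```

Under these assumed fractions, a change that only misbehaves at the second stage reaches at most 10% of tasks before it is rolled back.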

Google is committed to quickly and continually improving our technology and operations to prevent service disruptions. We appreciate your patience and apologize again for the impact to your organization. We thank you for your business.

Detailed Description of Impact

On Thursday, 10 November 2022 from 00:04 US/Pacific to 08:07 US/Pacific, up to 7% of read and 19% of write traffic in the europe-west1 region was failing. All of the impact was limited to data operations, including read, write, rewrite, clone, compose, and upload of objects. Customers in the europe-west1 region may have experienced the following symptoms during this period:

  • Affected GCS customers may have received HTTP 503 errors for read/write operations in europe-west1. Metadata operations such as object listing continued to work successfully.
  • Affected customers of Google BigQuery may have received “INTERNAL_ERROR” when running import jobs in europe-west1 during the impact window.
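
Because the failures were intermittent HTTP 503 responses, clients that retry idempotent requests with exponential backoff may be better placed to ride out errors of this kind. The sketch below is a minimal, library-agnostic illustration. `TransientError`, `call_with_backoff`, and the delay parameters are hypothetical names and values, not part of any Google client library, and the status updates for this incident listed no workaround.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical error the caller's code raises for a retryable status."""

    def __init__(self, status: int):
        super().__init__(f"transient HTTP {status}")
        self.status = status


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on TransientError with capped, jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt, is capped at max_delay,
            # and gets up to one second of random jitter added.
            delay = min(base_delay * (2 ** attempt), max_delay)
            sleep(delay + random.uniform(0, 1))
    raise RuntimeError("unreachable")
```

A caller would wrap a single read or upload in `operation` and raise `TransientError(503)` when the response status is 503. Only operations that are safe to repeat should be retried this way.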
10 Nov 2022 14:42 PST

Mini Incident Report

We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support or to Google Workspace Support using help article https://support.google.com/a/answer/1047213.

(All Times US/Pacific)

Incident Start: 10 November 2022 00:04

Incident End: 10 November 2022 08:07

Duration: 8 hours, 3 minutes

Affected Services and Features:

Google Cloud Storage, Google BigQuery

Regions/Zones: europe-west1

Description:

Google Cloud Storage experienced intermittent unavailability errors for a period of 8 hours and 3 minutes in europe-west1. From a preliminary analysis, the root cause of the issue was related to a recent change to network traffic routing. This change was rolled back to successfully mitigate the issue. Google will be providing a full Incident Report that will provide additional root cause information.

Customer Impact:

  • Google Cloud Storage customers would have received HTTP 503 errors for read/write operations in europe-west1. Metadata operations such as object listing continued to work successfully.
  • Google BigQuery customers may have received “INTERNAL_ERROR” when running import jobs in europe-west1 during the impact window.
10 Nov 2022 08:41 PST

The issue with Google Cloud Storage has been resolved for all affected users as of Thursday, 2022-11-10 08:07 US/Pacific.

The mitigation applied by our engineering team worked as expected.

We thank you for your patience while we worked on resolving the issue.

10 Nov 2022 08:27 PST

Summary: Google Cloud Storage (GCS) in europe-west1 is experiencing unavailability errors

Description: We are experiencing an intermittent issue with Google Cloud Storage beginning on Thursday, 2022-11-10 00:04:43 US/Pacific.

This issue is suspected to have been caused by a recently rolled-out update. The engineering team is rolling back the update, and current data indicates that the rollback is effective in mitigating the issue.

The mitigation is expected to be completed by Thursday, 2022-11-10 08:40 US/Pacific.

We will provide more information by Thursday, 2022-11-10 09:00 US/Pacific.

Diagnosis: GCS users will experience 503 errors for many operations

Workaround: None at this time

10 Nov 2022 07:17 PST

Summary: Google Cloud Storage (GCS) in europe-west1 is experiencing unavailability errors

Description: We are experiencing an intermittent issue with Google Cloud Storage beginning on Thursday, 2022-11-10 00:04:43 US/Pacific.

Mitigation work is currently underway by our engineering team.

We do not have an ETA for mitigation at this point.

We will provide more information by Thursday, 2022-11-10 08:35 US/Pacific.

Diagnosis: GCS users will experience 503 errors for many operations

Workaround: None at this time

10 Nov 2022 06:41 PST

Summary: Google Cloud Storage (GCS) in europe-west1 is experiencing unavailability errors

Description: We are experiencing an intermittent issue with Google Cloud Storage beginning on Thursday, 2022-11-10 00:04:43 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Thursday, 2022-11-10 07:20 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: GCS users will experience 503 errors for many operations

Workaround: None at this time

10 Nov 2022 06:14 PST

Summary: Google Cloud Storage (GCS) in europe-west1 is experiencing unavailability errors

Description: We are experiencing an intermittent issue with Google Cloud Storage beginning on Thursday, 2022-11-10 05:20:04 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Thursday, 2022-11-10 06:45 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: GCS users will experience errors for many operations

Workaround: None at this time

10 Nov 2022 06:08 PST

Summary: Google Cloud Storage (GCS) in europe-west1 is experiencing unavailability errors

Description: We are experiencing an intermittent issue with Google Cloud Storage beginning on Thursday, 2022-11-10 05:20:04 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Thursday, 2022-11-10 06:45 US/Pacific with current details.

We apologize to all who are affected by the disruption.

Diagnosis: GCS users will experience errors for many operations

Workaround: None at this time