Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Google BigQuery, Operations, Google Compute Engine, Google Cloud Bigtable, Cloud Logging, Google Cloud Storage, Google Cloud Console, Identity and Access Management, Access Context Manager, Cloud Firestore

We are investigating a potential issue with multiple Google Cloud services.

Incident began at 2022-11-14 10:50 and ended at 2022-11-14 11:38 (all times are US/Pacific).

Previously affected location(s)

Global

Date Time Description
23 Nov 2022 09:41 PST

Incident Report

Summary

On 14 November 2022, multiple Google Workspace and Google Cloud Platform (GCP) services experienced elevated error rates affecting a small percentage of customers, for 38 minutes and 48 minutes respectively. To our Google Workspace and Google Cloud customers whose businesses were impacted during this outage, we sincerely apologize. This is not the level of quality and reliability we strive to offer you. We have conducted an internal investigation and are taking immediate steps to improve the platform’s performance and availability.

Root Cause

Google Workspace services (for eligible customers) rely on Context-Aware Access [1] and GCP services rely on Access Context Manager [2]. These are a unified set of infrastructure services (henceforth referred to as "Access Context Manager") that are responsible for determining what access levels a request has based on the level of trust a company's security policy places in that request. The customer must enable Access Context Manager.

On 14 November at 09:25 PT, Google engineers were alerted to an issue in which some background tasks were experiencing resource exhaustion with Access Context Manager, which authorizes enhanced verification access for enterprise customers.

During an initial attempt to resolve the issue, the capacity of the service was inadvertently decreased. This caused authentication traffic to be directed to a small number of tasks, which were unable to handle the load. Internal safeguards successfully isolated the impact to a single data center, limiting its scope, but they were unable to stop the service tasks from repeatedly crashing and restarting. In the absence of access-level responses from the system, dependent Google Cloud and Google Workspace services behaved correctly and failed closed, as they are supposed to when they do not have a clear signal that access is permitted.
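The fail-closed behavior described above can be illustrated with a minimal sketch. It is not Google's implementation: the names `check_access`, `fetch_access_level`, and `Decision` are hypothetical and exist only for this example. The sketch assumes a caller that asks a policy service for an access level and denies the request whenever no clear answer comes back.

```python
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


def check_access(
    request_id: str,
    fetch_access_level: Callable[[str], Optional[str]],  # hypothetical policy lookup
    required_level: str,
) -> Decision:
    """Fail closed: allow only on a clear, positive signal from the policy service."""
    try:
        level = fetch_access_level(request_id)
    except Exception:
        # The policy service is unreachable or crashing: no clear signal, so deny.
        return Decision.DENY
    if level is None:
        # An empty response is also treated as "no signal".
        return Decision.DENY
    return Decision.ALLOW if level == required_level else Decision.DENY
```

Under this design an outage of the policy service surfaces as errors for legitimate users, which matches the impact reported here, but it never grants access that the security policy has not confirmed.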

Remediation and Prevention

Google engineers redirected traffic away from the affected Access Context Manager infrastructure at 11:19 PT and routed it to data centers that had available capacity to manage the load. Service was restored by 11:28 PT for Workspace and 11:38 PT for Google Cloud. Google is committed to preventing a repeat of this issue in the future and is completing the following actions:

  • Update Access Context Manager playbooks to include better information on baseline production task configurations, to prevent decreasing capacity to unsafe levels, and to use commands that cannot confuse growing capacity with shrinking it.
  • Re-evaluate the review/approval process for Access Context Manager capacity changes to ensure that request reviewers have better visibility into the changes being requested to prevent unintentional reductions in capacity.
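As an illustration of the two actions above, the following sketch shows one way a capacity change could be validated before it is applied. It is an assumption-laden example, not the tooling Google uses: `CapacityChange`, `validate_change`, `baseline_min_tasks`, and the `intent` argument are all hypothetical names. The operator must state the intended direction explicitly, and the change is rejected if the direction does not match or if the target falls below a baseline floor.

```python
from dataclasses import dataclass


class UnsafeCapacityChange(ValueError):
    """Raised when a requested change is inconsistent or below the safe floor."""


@dataclass(frozen=True)
class CapacityChange:
    current_tasks: int
    target_tasks: int
    baseline_min_tasks: int  # hypothetical safe floor taken from a playbook


def validate_change(change: CapacityChange, intent: str) -> int:
    """Return the signed task delta if the change is safe, otherwise raise."""
    if intent not in ("grow", "shrink"):
        raise ValueError("intent must be 'grow' or 'shrink'")
    delta = change.target_tasks - change.current_tasks
    # The stated intent must match the actual direction of the change.
    if intent == "grow" and delta <= 0:
        raise UnsafeCapacityChange("intent is 'grow' but the target does not add tasks")
    if intent == "shrink" and delta >= 0:
        raise UnsafeCapacityChange("intent is 'shrink' but the target does not remove tasks")
    # Never go below the documented baseline, whatever the intent.
    if change.target_tasks < change.baseline_min_tasks:
        raise UnsafeCapacityChange("target is below the baseline minimum")
    return delta
```

Returning the signed delta also gives a reviewer a single number to inspect, which speaks to the visibility concern in the second action item.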

Detailed Description of Impact

Google Workspace Product Impact: A small number of users of the following Google Workspace products experienced HTTP 500 Internal Server Errors between 10:50 PT and 11:28 PT: Calendar, Tasks, Gmail, Drive, Docs, Meet, Keep, Jamboard, and Chat. Consumer users were not affected.

  • Calendar, Tasks, Gmail, Drive, Docs, Meet, Keep, Jamboard, Chat: Affected users experienced HTTP 500 Internal Server Errors.
  • Admin Console: Customers' admin consoles displayed generic logos, rather than company-specific logos.
  • Groups: Groups User Interface and Settings API served HTTP 500 Internal Server Errors to all users.

Google Cloud Product Impact: A small number of users of the following GCP products may have experienced increased error rates and unavailability between 10:50 PT and 11:38 PT, in the regions noted below.

  • Google Compute Engine: Affected customers experienced elevated API traffic error rates up to 6% in the following regions: asia-east2, asia-northeast2, asia-northeast3, asia-northeast1, asia-south1, asia-south2, asia-southeast1, asia-southeast2, australia-southeast1, europe-west9, and southamerica-east1.
  • Google Cloud Console: Approximately 3% of customers were affected, experiencing increased error rates of up to 0.5%.
  • Resource Manager: Affected customers may have seen elevated error rates up to 2.6% and seen generic status unavailable errors when using the API.
  • Google BigQuery: Affected customers experienced elevated API traffic error rates up to 10% when using the BigQuery storage read and write APIs in southamerica-west1 and in the BigQuery US multi-region.
  • Cloud Firestore: Up to 7.5% of customer projects experienced elevated latency, operation deadline expirations, and/or error messages that state Policy Checks are unavailable. The impact was restricted to projects hosted in nam5 US multi-region, which includes us-central1.

14 Nov 2022 22:03 PST

Mini Incident Report

We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support or to Google Workspace Support using help article https://support.google.com/a/answer/1047213.

(All Times US/Pacific)

Google Workspace Impact Start: 14 November 2022 10:50

Google Workspace Impact End: 14 November 2022 11:27

Duration: 37 minutes

Google Cloud Platform Impact Start: 14 November 2022 10:50

Google Cloud Platform Impact End: 14 November 2022 11:38

Duration: 48 minutes
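The two durations above follow directly from the listed start and end times. A small sketch of the arithmetic, using only the timestamps given in this report (the helper name `minutes_between` is introduced here for illustration):

```python
from datetime import datetime

FMT = "%d %B %Y %H:%M"  # matches timestamps such as "14 November 2022 10:50"


def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed between two timestamps in the same time zone."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


workspace_minutes = minutes_between("14 November 2022 10:50", "14 November 2022 11:27")
gcp_minutes = minutes_between("14 November 2022 10:50", "14 November 2022 11:38")
print(workspace_minutes, gcp_minutes)  # 37 48
```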

Description:

Multiple Google Workspace and Google Cloud Platform services experienced elevated error rates for a duration of 37 minutes and 48 minutes respectively. From preliminary analysis, the root cause is an issue with Google’s system that authorizes access for enterprise customers.

Customer Impact:

Google Workspace Product Impact:

  • Calendar - Affected customers encountered error 500 when accessing calendar.google.com.
  • Tasks - Affected customers encountered error 500 when using Google Tasks.
  • Gmail - Affected customers encountered error 500 when accessing Gmail.
  • Drive - Affected customers encountered error 500 when accessing drive.google.com.
  • Docs - Affected customers encountered errors when opening or autosaving documents.
  • Meet - Affected customers encountered error 500 when accessing meet.google.com.
  • Voice - Affected customers saw an increase in latency while attempting to use voice.google.com.
  • Keep - Affected customers encountered error 500 while accessing Google Keep.
  • Jamboard - Affected users saw error 500 when using Jamboard services.
  • Chat - Affected customers encountered error 500 when accessing Chat.
  • Admin Console - Customers' admin consoles displayed generic logos rather than company-specific logos.
  • Groups - Groups User Interface and Settings API served error 500 to all users.

Google Cloud Product Impact:

  • Google Compute Engine - Customers experienced elevated API traffic error rates.
  • Google Cloud Console - Customers may have experienced generic unavailable errors.
  • Resource Manager - Customers may have noticed generic status unavailable errors.
  • Google BigQuery - Customers may have encountered error messages that state Policy Checks are unavailable.
  • Cloud Firestore - Customers may have experienced elevated latency, operation deadline expirations and/or error messages that state Policy Checks are unavailable.

14 Nov 2022 12:36 PST

The issue with Access Context Manager, Cloud Logging, Google BigQuery, Google Cloud Bigtable, Google Cloud Console, Google Cloud Storage, Google Compute Engine, Identity and Access Management has been resolved for all affected users as of Monday, 2022-11-14 11:38 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

14 Nov 2022 12:35 PST

Summary: We are investigating a potential issue with multiple Google Cloud services.

Description: Mitigation work is currently underway by our engineering team.

We do not have an ETA for mitigation at this point.

We will provide more information by Monday, 2022-11-14 17:21 US/Pacific.

Diagnosis: None at this time.

Workaround: None at this time.