Service Health
Incident affecting Google BigQuery, Operations, Google Compute Engine, Google Cloud Bigtable, Cloud Logging, Google Cloud Storage, Google Cloud Console, Identity and Access Management, Access Context Manager, Cloud Firestore
We are investigating a potential issue with multiple Google Cloud services.
Incident began at 2022-11-14 10:50 and ended at 2022-11-14 11:38 (all times are US/Pacific).
Previously affected location(s)
Global
Date | Time | Description | |
---|---|---|---|
| 23 Nov 2022 | 09:41 PST | Incident Report. Summary: On 14 November 2022, multiple Google Workspace and Google Cloud Platform (GCP) services experienced elevated error rates affecting a small percentage of customers for 38 minutes and 48 minutes respectively. To our Google Workspace and Google Cloud customers whose businesses were impacted during this outage, we sincerely apologize. This is not the level of quality and reliability we strive to offer you, and we are taking immediate steps to improve the platform’s performance and availability. We have conducted an internal investigation and are taking steps to improve our service.
Root Cause: Google Workspace services (for eligible customers) rely on Context-Aware Access [1] and GCP services rely on Access Context Manager [2]. Together these form a unified set of infrastructure services (henceforth referred to as "Access Context Manager") responsible for determining which access levels a request has, based on the level of trust a company's security policy places in that request. Access Context Manager must be enabled by the customer. On 14 November at 09:25 PT, Google engineers were alerted that some background tasks were experiencing resource exhaustion in Access Context Manager, which authorizes enhanced verification access for enterprise customers. An initial attempt to resolve the issue inadvertently decreased the capacity of the service. As a result, authentication traffic was directed to a small number of tasks, which were unable to handle the load. Internal safeguards isolated the impact to a single data center, which limited its scope, but they could not stop the service tasks from repeatedly restarting or crashing. In the absence of access-level responses from the system, dependent Google Cloud and Google Workspace services behaved correctly and failed closed, as they are supposed to when they do not have a clear signal that access is permitted.
Remediation and Prevention: Google engineers redirected traffic away from the affected Access Context Manager infrastructure at 11:19 PT and routed it to data centers that had available capacity to manage the load. Service was restored by 11:28 PT for Workspace and 11:38 PT for Google Cloud. Google is committed to preventing a repeat of this issue in the future and is completing the following actions:
Detailed Description of Impact: Google Workspace Product Impact: A small number of users of the following Google Workspace products experienced HTTP 500 Internal Server Errors between 10:50 PT and 11:28 PT: Calendar, Tasks, Gmail, Drive, Docs, Meet, Keep, Jamboard, and Chat. Consumer users were not affected.
Google Cloud Product Impact: A small number of users of the following GCP products may have experienced increased error rates and unavailability between 10:50 PT and 11:38 PT in the following regions.
|
| 14 Nov 2022 | 22:03 PST | Mini Incident Report. We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support or to Google Workspace Support using help article https://support.google.com/a/answer/1047213. (All Times US/Pacific) Google Workspace Impact Start: 14 November 2022 10:50; Google Workspace Impact End: 14 November 2022 11:27; Duration: 37 minutes. Google Cloud Platform Impact Start: 14 November 2022 10:50; Google Cloud Platform Impact End: 14 November 2022 11:38; Duration: 48 minutes. Description: Multiple Google Workspace and Google Cloud Platform services experienced elevated error rates for a duration of 37 minutes and 48 minutes respectively. From preliminary analysis, the root cause is an issue with Google’s system that authorizes access for enterprise customers. Customer Impact: Google Workspace Product Impact:
Google Cloud Product Impact:
|
| 14 Nov 2022 | 12:36 PST | The issue with Access Context Manager, Cloud Logging, Google BigQuery, Google Cloud Bigtable, Google Cloud Console, Google Cloud Storage, Google Compute Engine, Identity and Access Management has been resolved for all affected users as of Monday, 2022-11-14 11:38 US/Pacific. We thank you for your patience while we worked on resolving the issue. |
| 14 Nov 2022 | 12:35 PST | Summary: We are investigating a potential issue with multiple Google Cloud services. Description: Mitigation work is currently underway by our engineering team. We do not have an ETA for mitigation at this point. We will provide more information by Monday, 2022-11-14 17:21 US/Pacific. Diagnosis: None at this time. Workaround: None at this time. |
- All times are US/Pacific
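
The incident report above notes that dependent services failed closed when they received no access-level response. The Python sketch below illustrates that general pattern only; it is not Google's implementation. Every name in it (`authorize_fail_closed`, `AccessLevelUnavailable`, `healthy_backend`, `crashing_backend`, the `"trusted"` level) is hypothetical and chosen for illustration.

```python
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


class AccessLevelUnavailable(Exception):
    """Hypothetical error: the access-level backend could not answer."""


def authorize_fail_closed(
    lookup_access_level: Callable[[str], Optional[str]],
    request_id: str,
    required_level: str,
) -> Decision:
    """Allow a request only on a clear, positive signal from the backend.

    Any missing or failed response is treated as a denial (fail closed).
    """
    try:
        level = lookup_access_level(request_id)
    except AccessLevelUnavailable:
        # No answer from the backend: deny rather than guess.
        return Decision.DENY
    if level == required_level:
        return Decision.ALLOW
    # Wrong level or no level at all: deny.
    return Decision.DENY


def healthy_backend(request_id: str) -> Optional[str]:
    # Hypothetical backend that answers normally.
    return "trusted"


def crashing_backend(request_id: str) -> Optional[str]:
    # Hypothetical backend whose tasks are overloaded or restarting.
    raise AccessLevelUnavailable("no access-level response")


if __name__ == "__main__":
    print(authorize_fail_closed(healthy_backend, "req-1", "trusted"))
    print(authorize_fail_closed(crashing_backend, "req-1", "trusted"))
```

Under this design, a backend outage surfaces to users as errors rather than as unintended access, which matches the behavior described in the report: affected users saw elevated error rates until traffic was rerouted.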