Service Health
Incident affecting Google Compute Engine, Google Kubernetes Engine, Google Cloud Bigtable, Persistent Disk, Google Cloud SQL, Cloud Filestore, AlloyDB for PostgreSQL, Apigee, Artifact Registry, Cloud Armor, Cloud Key Management Service, Cloud Load Balancing, Cloud Logging, Cloud Memorystore, Cloud NAT, Cloud Run, Cloud Spanner, Database Migration Service, Datastream, Eventarc, Google BigQuery, Google Cloud Composer, Google Cloud Console, Google Cloud Dataflow, Google Cloud Dataproc, Google Cloud DNS, Google Cloud Identity-Aware Proxy, Google Cloud Pub/Sub, Google Cloud Storage, Identity and Access Management, Managed Service for Microsoft Active Directory (AD), Secret Manager, Service Directory, Traffic Director, Vertex AI Batch Prediction, Virtual Private Cloud (VPC), Memorystore for Memcached, Memorystore for Redis
Multiple Google Cloud services in the europe-west9-a zone are impacted
Incident began at 2023-04-25 16:46 and ended at 2023-04-26 20:00 (all times are US/Pacific).
Previously affected location(s)
Paris (europe-west9), Global
Date | Time | Description | |
---|---|---|---|
| 23 Jun 2023 | 09:12 PDT | Incident Report. Summary: To our customers whose businesses were impacted during this outage, we sincerely apologize. This is not the level of quality and reliability we strive to offer, and we are taking immediate steps to improve the platform’s performance and resilience. Water Leak and Fire Damage: On Tuesday, 25 April 2023 at 16:46 US/Pacific, a cooling system water pipe leak occurred in one of the data centers in the europe-west9 region. The leak originated in a non-Google portion of the facility, entered an associated uninterruptible power supply (UPS) room, and led to a fire. The fire required evacuation of the facility, engagement from the local fire department, and a power shutdown of the entire data center building for several hours. The fire was successfully controlled on 26 April 2023 at 04:11 US/Pacific. Regional Impact: The europe-west9 region contains three buildings with independent cooling, power, and networking. Regional Spanner supports the regional control plane for most Google Cloud services in the region. In europe-west9, regional Spanner’s replicas were not distributed across the three buildings in a way that preserved quorum once the impacted building was powered down. As a result, the control plane services for many Google Cloud services were interrupted in europe-west9, causing unavailability and/or elevated error rates for these services in this region. Regional Spanner was restored on 26 April 2023 at 12:47 US/Pacific. The Identity and Access Management (IAM) service in europe-west9 depends upon regional Spanner. After regional Spanner came back online, IAM recovered on 26 April 2023 at 14:42 US/Pacific. IAM recovery took longer than expected because of the time needed to refresh the policies for the region after the outage. Some Google Cloud services that were impacted by this event began recovering soon after Spanner and IAM were back online.
Google Compute Engine (GCE) and Persistent Disk (PD) experienced regional impact in both the data plane (GCE Virtual Machines (VMs), PD Volumes) and control plane systems. Both systems had extended recovery periods after Spanner and IAM were restored. GCE VMs and PDs in europe-west9 zones were online by 19:00 US/Pacific on 26 April 2023, except for 58% of GCE VMs in zone europe-west9-a, which were directly impacted by the water leak. Once Spanner, IAM, and GCE/PD were online, the europe-west9 region was largely recovered and operational. Global Service Impact: The Google Compute Engine (GCE) control plane provides the underlying control plane for Google Cloud Platform. A small number of methods within the GCE control plane API must collect information from multiple regions or zones by making requests to each regional control plane (called fanout requests). Google Cloud services including Cloud Console depend on these methods. When the GCE control plane for the europe-west9 region and zones went offline, some of these fanout methods did not operate correctly. During the outage, this led to global unavailability for some pages and control plane operations within Cloud Console. The global impact that was observed due to fanout issues in the Google Cloud Console control plane was mitigated by removing the europe-west9 control plane from the set of visible regions on Wednesday, 26 April 2023 at 03:38 US/Pacific. Key Events and Timeline: Included below is a brief timeline of the overall incident. Additional details are provided in the following sections. All times and dates are listed in the US/Pacific time zone. Tuesday, 25 April 2023: 16:46 - Water leakage began on Level 2 above europe-west9-a zone. 17:31 - Google engineers were proactively alerted about a potential critical impact to one cluster (a collection of servers along with supporting equipment for networking, power, and cooling) supporting our europe-west9-a zone due to a water leak in its data center room.
18:30 - Google started to gradually bring down Google Compute Engine (GCE) virtual machine instances along with Persistent Disk (PD) volumes from this cluster supporting the europe-west9-a zone. 18:37 - As a preventative measure, engineers decided to preemptively power down all equipment supporting one of the clusters in europe-west9-a. 19:14 - Smoke was detected in the building with the water leakage. 19:42 - Completed power down of the impacted cluster with the water leak. 58% of GCE instances and 55% of PD volumes were taken out of service from europe-west9-a zone. 19:45 - The local fire department was called. 20:00 - Fire department arrived on site and took over incident command of the building fire. 20:08 - Google engineers were alerted to the battery room water leak and fire. 22:45 - Building fire suppression system water supply ran out. 22:51 - Google engineers were notified that the fire department had ordered building power to be shut down. 23:04 - Emergency shutdown commenced for one of the three buildings in europe-west9 on order of the fire marshal. At this time regional impact occurred because Regional Spanner lost quorum. The global outage began for some fanout Google Cloud APIs and Google Cloud Console operations across a number of services. Wednesday, 26 April 2023: 00:52 - Battery room fire suppression system restored. 03:38 - Google engineers removed the europe-west9 region from the list of available regions for Compute Engine. This restored services for Google Cloud Console globally and resumed instance creation for customers whose projects use global scope internal DNS. Global instance creation for Google Cloud SQL is also restored. 04:11 - Battery room fire reported to be extinguished. 10:00 - Availability of global dataset/job list operations is restored for BigQuery customers with data in europe-west9 by excluding the europe-west9 region. 12:47 - Regional Spanner quorum is restored and internal Spanner operations resume.
13:00 - Fire department confirmed the battery temperatures were stabilized, and the majority of the fire crew departed. 14:42 - Regional IAM services are restored. 15:14 - Google Cloud Storage (GCS) regional services are restored. 15:30 - Cloud Pub/Sub services are restored. 17:10 - Google BigQuery service is fully restored for europe-west9 region. 18:40 - Google Cloud Dataflow services are restored. 19:00 - Google Compute Engine control plane services restored across all zones. For the cluster in europe-west9-a that was not affected by the water leak, Google Compute Engine instances and Persistent Disk volumes were restored. This resulted in 42% of instances and volumes in zone europe-west9-a being restored. The remaining cluster in europe-west9-a that was affected by the water leak also had smoke damage. The severity of the smoke damage led to the whole cluster being decommissioned. Before decommissioning the cluster, we cleaned it and fully recovered all zonal data from the cluster. After Spanner, IAM, and GCE/PD had recovered at 19:00, the europe-west9 region was largely recovered and operational. Water Leak, Mitigation, and Prevention: On Tuesday, 25 April 2023 at 17:31 US/Pacific, Google engineers were proactively alerted about a potential critical impact to the europe-west9-a cloud zone and immediately started an investigation. Our Europe regional Security Operations Center was immediately engaged, and it was discovered that water had leaked from a non-Google room above onto the first floor of our facility for one of the clusters in zone europe-west9-a. As a preventative measure, engineers decided at 18:37 US/Pacific to preemptively power down all equipment supporting the one cluster in europe-west9-a affected by the water leak. At 19:14 US/Pacific, smoke was detected in the building and at 19:45 US/Pacific, the local fire department was called.
When the fire department arrived onsite at 20:00 US/Pacific, the fire marshal informed Google engineers that a full building shutdown was imminent to allow for firefighting activities. Shortly thereafter, at 20:08 US/Pacific, we were alerted that a battery room had also flooded, and the room had caught fire. At 22:45 US/Pacific, we were informed that the fire suppression system had run out of water. Once firefighters arrived on site, they began the process of connecting the battery room suppression system to the city fire water system. At 22:59 US/Pacific, upon notification from the local fire marshal that a full building shutdown was imminent, Google engineers proactively started to migrate the workloads from the building and completely powered it down at 23:29 US/Pacific. At this time regional impact occurred because regional Spanner no longer had quorum, due to a misconfiguration. Two of its three replicas were in the building that was powered down, instead of being correctly spread across the three buildings in the europe-west9 region. At this time, some global APIs that aggregate information from each region by making a request to each regional control plane (called fanout requests) began to fail. The local fire department continued to fight the fire in the UPS battery room on the lower floor of the facility and to control the fire. The facility fire continued to pose a threat of re-igniting until Wednesday, 26 April 2023 at 11:00 US/Pacific. Unfortunately, it took nearly 16 hours to extinguish the fire in the battery room, which delayed recovery efforts for the impacted building. Once it was clear to the Google Cloud team that power and cooling would not be further affected by the events impacting europe-west9-a, Google started working to bring back the clusters in the impacted building. The cluster in zone europe-west9-a that had water intrusion also had smoke damage; it remained powered off, and access was restricted for safety and security reasons.
The fire introduced water and soot contamination to the data center space for this cluster in zone europe-west9-a. The affected racks of servers in the cluster supporting europe-west9-a had to be taken apart, thoroughly cleaned, and re-assembled before they could be powered on again for recovery of the zonal data before decommissioning the cluster. All equipment was thoroughly cleaned using professional cleaning services. These services also began pumping out the standing water on Saturday, 29 April 2023. The cleaning efforts included reducing high humidity levels, clearance of rubble from the facility battery room, cleaning of all physical equipment, and cleaning of greasy soot contamination (which was observed throughout the impacted cluster in zone europe-west9-a). After the cluster was cleaned, we performed a staged power-on of this cluster in zone europe-west9-a to recover customer zonal data. After a successful multi-day power restoration process, Google engineers were able to power on all the equipment in the cluster and recovered all zonal data in europe-west9-a. Customers experienced no data loss from the incident. Google has controls in place to detect and mitigate the risk of fire safety issues. These controls are audited by both internal teams with due diligence and operational reviews and by Google Cloud’s external auditors. These control objectives apply to all Google Cloud regions in facilities operated by third parties. Google Cloud regions (whether operated by Google Cloud or third parties) are also audited against Business Continuity Management System (BCMS) controls in place that support their ISO 22301 certification [1]. In addition, we are executing the following actions to prevent a recurrence of this issue:
[1] - https://cloud.google.com/security/compliance/iso-22301 Google Cloud Services on the critical path for restoring the region. Regional Spanner: Google uses an internal version of regional Spanner as a back-end database to several Google Cloud services such as IAM and various control planes that manage our infrastructure and services for a region. The outage had a regional impact because this regional Spanner was not configured correctly across the three buildings in the region for it to maintain its quorum. Regional Spanner should have had one replica in each of the three buildings in the region. Instead, it had two of its three replicas in two different clusters in the building that was powered down. As a result, regional Spanner no longer met quorum across the region, impacting IAM and the control planes in the region. On Wednesday, 26 April 2023 at 12:47 US/Pacific, regional Spanner’s quorum was restored, which then allowed IAM and the region's control planes to begin recovery. We are currently conducting a detailed per-region audit (and any required remediation) of our internal regional Spanner allocations to confirm all regions fully meet Google Cloud expectations for fault isolation to prevent this issue in the future. IAM: Regional IAM needed regional Spanner available for it to serve and process requests. After regional Spanner was brought back to service, we kept Cloud IAM in this region offline while it synced to get up to date with the global view of Cloud IAM. Most customers’ policies enforced by IAM are global in nature, spanning identities, folders, and projects. These policies had received no updates for 13 hours in this region, and any security-critical updates made by customers during this period would not have been available there if we had brought it online without syncing. For example, if a user was taken out of a critical IAM role for a storage bucket, the IAM server in the region would not yet know about it.
Out of an abundance of caution, and treating our customers’ security as paramount, we decided to not serve traffic for IAM in europe-west9 until it completed full synchronization, and began serving traffic with up-to-date policies at 14:42 US/Pacific. To address this recovery delay, we are implementing changes to our IAM policy refresh mechanism to ensure any stale policies (as a result of outages) are synchronized and resolved in less than 15 minutes. We are achieving this by providing improvements in replicating policies and parallelizing the synchronization when recovery is needed. Google Compute Engine (GCE) and Persistent Disk (PD): Google Compute Engine (GCE) and Persistent Disk (PD) experienced regional impact in both the data plane (GCE VMs, PD Volumes) and control plane systems. Both systems had extended recovery periods after Spanner and IAM were restored. Regional Persistent Disk (PD) replicates volumes across multiple zones. To ensure proper quorum, Regional PD maintains a lease for each volume in a regional Spanner database. No Regional PD volumes had both replicas in the same building within the region. However, 79% of regional Persistent Disk volumes in the europe-west9 region had one replica in the impacted building. When Spanner was offline in the region, these volumes could not establish quorum, and therefore became read-only as a failsafe measure. If Regional Spanner had not gone offline, Regional Persistent Disk (PD) would have ensured that the Regional PD volumes with one replica would work as expected (as detailed in the RePD documentation [2]) to enable customer workloads to continue operating in the available zone. Spanner and IAM were online by 26 April 2023 at 14:42 US/Pacific. Excluding the 58% of GCE VMs in europe-west9-a directly impacted by the water leak, the VMs and PDs in europe-west9-a were online by 19:00 US/Pacific.
The intervening time was needed to validate Persistent Disk integrity, to ensure that all volumes were ready for production operations before customer VMs were restarted, and then to re-enable Persistent Disk operations. This took longer than expected due to a suspected issue with the PD health check verifier, which reported errors. These were false positives due to an issue in starting up this many PDs at once. As soon as the team confirmed that the PD volumes were healthy, we initiated the recovery process. To ensure we don’t have these additional delays for PD in the future we are:
The GCE control plane experienced a regional impact due to a dependence on regional Spanner. After regional Spanner recovered, the recovery time of the GCE control plane was extended due to the time required to revert service configuration changes that were made as part of the initial mitigation of global control plane impact. During the incident, the GCE control plane experienced some fanout API issues. To mitigate this impact, the GCE control plane was reconfigured on 26 April 2023 at 03:38 US/Pacific to remove the europe-west9 region and zones. When the IAM service was restored, this configuration change was reverted. Restoring europe-west9 to the set of active GCE regions required two sequential configuration changes. Google Cloud configuration changes of this nature are deployed gradually with an incremental rollout. The first configuration change started at 14:53 US/Pacific and completed at 15:33 US/Pacific. The second configuration change, which brought the europe-west9 region and zones back online, started at 15:46 US/Pacific and finished with the last zone at 19:00 US/Pacific. At this point the GCE control plane was fully operational. To improve the mitigation and recovery time in the future GCE will:
[2] - https://cloud.google.com/compute/docs/disks#repds Enhanced Zonal & Regional Failure Testing: We are adding improvements to our zonal and regional environment to test for and ensure system behavior is as intended across all services in the case of zonal and regional outages. This includes validating that critical services consistently meet our expectations for how they allocate resources within our physical failure domains, that automated verification tests are in place to ensure no regressions, and that recovery times are efficient. Global Google Cloud Impact and Restoration: Cloud Console provides a global view of Google Cloud resources. Certain Cloud Console pages rely on GCE backend APIs to perform global aggregated fanout across the regions. Some of these APIs failed to operate correctly while europe-west9 was unavailable, rather than returning partial data. The global impact that was observed due to fanout issues in the Google Cloud Console control plane was mitigated by removing the europe-west9 control plane from the set of visible regions on Wednesday, 26 April 2023 at 03:38 US/Pacific. We are updating the Cloud Console so that fanout API methods continue to operate as expected to prevent this issue from occurring again. In addition, we are performing an audit (and any required remediation) across our services for any instances of fanout methods that could have similar issues. GCE provides an Internal DNS [3] feature to enable customers to look up instance names within their project. This service can be configured to either register instance names in a single Global namespace or in multiple Zonal namespaces. During the outage, customers using the Global namespace for Internal DNS were unable to complete VM creation if the project had VMs in europe-west9, since VM creation requires coordination with all zones where the project has resources to avoid duplicate instance DNS names. Customers using Zonal namespace for Internal DNS did not experience this impact.
We recommend using Internal DNS with Zonal namespaces for your project (please see the documentation [3] for details). [3] - https://cloud.google.com/compute/docs/internal-dns Mitigation & Prevention of Additional Services: After Regional Spanner, IAM, and GCE/PD came back online for the region, almost all of the services in the region fully recovered soon after that. The following services either took longer to recover or had additional impact during the outage.
Please log a case with our Support team if you would like to receive additional information regarding these services. We thank you for your patience while we worked on resolving the outage and completing our investigation. We apologize to our customers whose businesses were impacted during this outage. This is not the level of quality and reliability we strive to offer, and we are taking immediate steps outlined in this IR and more to improve our platform’s performance and resilience. |
| 27 Apr 2023 | 06:39 PDT | Summary: Multiple Google Cloud services in the europe-west9-a zone are impacted Description: Water intrusion in a data center in europe-west9 caused a multi-cluster failure that led to a shutdown of multiple zones. Impact is now limited to services in europe-west9-a. There is no ETA for full recovery of operations in europe-west9-a at this time. We expected to see extended outages for some services. Customers are advised to failover to other zones/regions if they are impacted. The following services have fully recovered in europe-west9: Google Cloud Storage (GCS) Cloud Key Management Service (KMS) Cloud Identity and Access Management (IAM). The following services have recovered in europe-west9-b and europe-west9-c, but continue to be impacted in europe-west9-a: Google Compute Engine (GCE) Cloud Run Google Cloud Load Balancer (GCLB) DataProc Cloud SQL We will provide an update by Thursday, 2023-04-27 11:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in europe-west9 region. Workaround: Customers can failover to zones in other regions. |
| 27 Apr 2023 | 02:30 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 caused a multi-cluster failure that led to a shutdown of multiple zones. We expect some unavailability in the europe-west9 region. There is no current ETA for full recovery of operations in the europe-west9 region at this time. We expected to see an extended outage for some services. Customers are advised to failover to other regions if they are impacted. The following services have fully recovered in europe-west9: Google Cloud Storage (GCS) Cloud Key Management Service (KMS) Cloud Identity and Access Management (IAM) Google Kubernetes Engine (GKE) The following services have recovered in europe-west9-b and europe-west9-c, but continue to be impacted in europe-west9-a: Google Compute Engine (GCE) Cloud Run Google Cloud Load Balancer (GCLB) DataProc Cloud SQL Cloud Console: Experienced a global outage, which has been mitigated. Management tasks should be operational again for operations outside the affected region (europe-west9). Primary impact was observed from 2023-04-25 23:15:30 PDT to 2023-04-26 03:38:40 PDT. GCE Global Control Plane: Experienced a global outage, which has been mitigated. Primary impact was observed from 2023-04-25 23:15:20 PDT to 2023-04-26 03:45:30 PDT and impacted customers utilizing Global DNS (gDNS). A secondary global impact for aggregated list operation failures for customers with resources in europe-west9 has also been mitigated. Please see migration guide for gDNS to Zonal DNS for more information: https://cloud.google.com/compute/docs/internal-dns#migrating-to-zonal Cloud Pub/Sub: The issue with Cloud Pub/Sub has been resolved for all affected users as of Wednesday, 2023-04-26 16:07 US/Pacific.
For more information please see here: https://status.cloud.google.com/incidents/j6LfsjxCXhVDjmGGPhS7#2c2sBHWU84yPDJ8y1ar4 BigQuery: The issue with Google BigQuery has been resolved for all affected users as of Wednesday, 2023-04-26 17:05 US/Pacific. For more information please see here: https://status.cloud.google.com/incidents/TbcwMSkKy8MTmeeEiqaq#scTMecZFsPpiygYrQ9sG We will provide an update by Thursday, 2023-04-27 07:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in europe-west9 region. Workaround: Customers can failover to zones in other regions. |
| 27 Apr 2023 | 01:54 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 caused a multi-cluster failure that led to a shutdown of multiple zones. We expect some unavailability in the europe-west9 region. There is no current ETA for full recovery of operations in the europe-west9 region at this time. We expected to see an extended outage for some services. Customers are advised to failover to other regions if they are impacted. The following services have fully recovered in europe-west9: Google Cloud Storage (GCS) Cloud Key Management Service (KMS) Cloud Identity and Access Management (IAM) Google Kubernetes Engine (GKE) The following services have recovered in europe-west9-b and europe-west9-c, but continue to be impacted in europe-west9-a: Google Compute Engine (GCE) Cloud Run Google Cloud Load Balancer (GCLB) DataProc Cloud SQL Cloud Console: Experienced a global outage, which has been mitigated. Management tasks should be operational again for operations outside the affected region (europe-west9). Primary impact was observed from 2023-04-25 23:15:30 PDT to 2023-04-26 03:38:40 PDT. GCE Global Control Plane: Experienced a global outage, which has been mitigated. Primary impact was observed from 2023-04-25 23:15:20 PDT to 2023-04-26 03:45:30 PDT and impacted customers utilizing Global DNS (gDNS). A secondary global impact for aggregated list operation failures for customers with resources in europe-west9 has also been mitigated. 
Please see migration guide for gDNS to Zonal DNS for more information: https://cloud.google.com/compute/docs/internal-dns#migrating-to-zonal Cloud Pub/Sub: For information related to ongoing Cloud Pub/Sub impact, please see the latest status here: https://status.cloud.google.com/incidents/j6LfsjxCXhVDjmGGPhS7#2c2sBHWU84yPDJ8y1ar4 BigQuery: The issue with Google BigQuery has been resolved for all affected users as of Wednesday, 2023-04-26 17:05 US/Pacific. For more information please see here: https://status.cloud.google.com/incidents/TbcwMSkKy8MTmeeEiqaq#scTMecZFsPpiygYrQ9sG We will provide an update by Thursday, 2023-04-27 03:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in europe-west9 region. Workaround: Customers can failover to zones in other regions. |
| 26 Apr 2023 | 22:49 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 caused a multi-cluster failure that led to a shutdown of multiple zones. We expect some unavailability in the europe-west9 region. There is no current ETA for full recovery of operations in the europe-west9 region at this time. We expected to see an extended outage for some services. Customers are advised to failover to other regions if they are impacted. The following services have fully recovered in europe-west9: Google Cloud Storage (GCS) Cloud Key Management Service (KMS) Cloud Identity and Access Management (IAM) Google Kubernetes Engine (GKE) The following services have recovered in europe-west9-b and europe-west9-c, but continue to be impacted in europe-west9-a: Google Compute Engine (GCE) Cloud Run Google Cloud Load Balancer (GCLB) DataProc Cloud SQL Cloud Console: Experienced a global outage, which has been mitigated. Management tasks should be operational again for operations outside the affected region (europe-west9). Primary impact was observed from 2023-04-25 23:15:30 PDT to 2023-04-26 03:38:40 PDT. GCE Global Control Plane: Experienced a global outage, which has been mitigated. Primary impact was observed from 2023-04-25 23:15:20 PDT to 2023-04-26 03:45:30 PDT and impacted customers utilizing Global DNS (gDNS). A secondary global impact for aggregated list operation failures for customers with resources in europe-west9 has also been mitigated. 
Please see migration guide for gDNS to Zonal DNS for more information: https://cloud.google.com/compute/docs/internal-dns#migrating-to-zonal Cloud Pub/Sub: For information related to ongoing Cloud Pub/Sub impact, please see the latest status here: https://status.cloud.google.com/incidents/j6LfsjxCXhVDjmGGPhS7#2c2sBHWU84yPDJ8y1ar4 BigQuery: The issue with Google BigQuery has been resolved for all affected users as of Wednesday, 2023-04-26 17:05 US/Pacific. For more information please see here: https://status.cloud.google.com/incidents/TbcwMSkKy8MTmeeEiqaq#scTMecZFsPpiygYrQ9sG We will provide an update by Thursday, 2023-04-27 03:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in europe-west9 region. Workaround: Customers can failover to zones in other regions. |
| 26 Apr 2023 | 19:52 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 has caused a multi-cluster failure and has led to a shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to failover to other regions if they are impacted. Cloud Console: Experienced a global outage, which has been mitigated. Management tasks should be operational again for operations outside the affected region (europe-west9). Primary impact was observed from 2023-04-25 23:15:30 PDT to 2023-04-26 03:38:40 PDT. GCE Global Control Plane: Experienced a global outage, which has been mitigated. Primary impact was observed from 2023-04-25 23:15:20 PDT to 2023-04-26 03:45:30 PDT and impacted customers utilizing Global DNS (gDNS). A secondary global impact for aggregated list operation failures for customers with resources in europe-west9 has also been mitigated. Please see migration guide for gDNS to Zonal DNS for more information: https://cloud.google.com/compute/docs/internal-dns#migrating-to-zonal Cloud Pub/Sub: For information related to ongoing Cloud Pub/Sub impact, please see the latest status here: https://status.cloud.google.com/incidents/j6LfsjxCXhVDjmGGPhS7#2c2sBHWU84yPDJ8y1ar4 BigQuery: For information related to ongoing BigQuery impact, please see the latest status here: https://status.cloud.google.com/incidents/TbcwMSkKy8MTmeeEiqaq#scTMecZFsPpiygYrQ9sG We will provide an update by Thursday, 2023-04-27 00:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in europe-west9 region. Workaround: Customers can failover to zones in other regions. |
| 26 Apr 2023 | 16:01 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 has caused a multi-cluster failure and has led to a shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to failover to other regions if they are impacted. Cloud Console: Experienced a global outage, which has been mitigated. Management tasks should be operational again for operations outside the affected region (europe-west9). Primary impact was observed from 2023-04-25 23:15:30 PDT to 2023-04-26 03:38:40 PDT. GCE Global Control Plane: Experienced a global outage, which has been mitigated. Primary impact was observed from 2023-04-25 23:15:20 PDT to 2023-04-26 03:45:30 PDT and impacted customers utilizing Global DNS (gDNS). A secondary global impact for aggregated list operation failures for customers with resources in europe-west9 has also been mitigated. Please see migration guide for gDNS to Zonal DNS for more information: https://cloud.google.com/compute/docs/internal-dns#migrating-to-zonal Cloud Pub/Sub: For information related to ongoing Cloud Pub/Sub impact, please see the latest status here: https://status.cloud.google.com/incidents/j6LfsjxCXhVDjmGGPhS7#2c2sBHWU84yPDJ8y1ar4 BigQuery: For information related to ongoing BigQuery impact, please see the latest status here: https://status.cloud.google.com/incidents/TbcwMSkKy8MTmeeEiqaq#scTMecZFsPpiygYrQ9sG We will provide an update by Wednesday, 2023-04-26 20:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in europe-west9 region. Workaround: Customers can failover to zones in other regions. |
| 26 Apr 2023 | 11:59 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 has caused a multi-cluster failure and has led to a shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. Cloud Console: Experienced global outages, which have been mitigated. Management tasks should be operational again for operations outside the affected region (europe-west9). Primary impact was observed from 2023-04-25 23:15:30 PDT to 2023-04-26 03:38:40 PDT. GCE Control Plane: Experienced global outages, which have been mitigated. Primary impact was observed from 2023-04-25 23:15:20 PDT to 2023-04-26 03:45:30 PDT and impacted customers utilizing Global DNS (gDNS). A secondary global impact for aggregated list operation failures for customers with resources in europe-west9 has also been mitigated. Please see the migration guide for gDNS to Zonal DNS for more information: https://cloud.google.com/compute/docs/internal-dns#migrating-to-zonal Cloud Pub/Sub: For information related to ongoing Cloud Pub/Sub impact, please see the latest status here: https://status.cloud.google.com/incidents/j6LfsjxCXhVDjmGGPhS7#2c2sBHWU84yPDJ8y1ar4 BigQuery: For information related to ongoing BigQuery impact, please see the latest status here: https://status.cloud.google.com/incidents/TbcwMSkKy8MTmeeEiqaq#scTMecZFsPpiygYrQ9sG We will provide an update by Wednesday, 2023-04-26 16:00 US/Pacific, or upon any significant development. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 07:01 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 has caused a multi-cluster failure and has led to a shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 12:00 US/Pacific, or upon any significant development. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 06:58 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in a data center in europe-west9 has caused a multi-cluster failure and has led to a shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 12:00 US/Pacific, or upon any significant development. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 05:52 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. For management tasks, Cloud Console should be operational again for operations outside the affected region (europe-west9). We will provide an update by Wednesday, 2023-04-26 07:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 05:10 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. For management tasks, Cloud Console should be operational again for operations outside the affected region (europe-west9). We will provide an update by Wednesday, 2023-04-26 06:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 05:07 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 06:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 04:51 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 06:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. Customers should use gcloud commands instead of Cloud Console for management tasks. |
| 26 Apr 2023 | 03:39 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 05:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. Customers should use gcloud commands instead of Cloud Console for management tasks. |
| 26 Apr 2023 | 02:42 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 04:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Customers using Cloud Console globally are unable to open and view Compute Engine pages such as the Instance creation page, Disk creation page, Instance templates page, and Instance Groups page. Workaround: Customers can fail over to zones in other regions. Customers should use gcloud commands instead of Cloud Console for management tasks. |
| 26 Apr 2023 | 02:25 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 03:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Customers using Cloud Console globally are unable to open and view Compute Engine pages such as the Instance creation page, Disk creation page, Instance templates page, and Instance Groups page. Workaround: Customers can fail over to zones in other regions. Customers should use gcloud commands instead of Cloud Console for management tasks. |
| 26 Apr 2023 | 01:50 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 03:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 26 Apr 2023 | 00:35 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 02:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 25 Apr 2023 | 23:05 PDT | Summary: Multiple Google Cloud services in the europe-west9 region are impacted. Description: Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones. We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage. Customers are advised to fail over to other regions if they are impacted. We will provide an update by Wednesday, 2023-04-26 00:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in the europe-west9 region. Workaround: Customers can fail over to zones in other regions. |
| 25 Apr 2023 | 22:21 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: Water intrusion in europe-west9-a led to an emergency shutdown of some hardware in that zone. There is no current ETA for recovery of operations in europe-west9-a, but it is expected to be an extended outage. Customers are advised to fail over to other zones if they are impacted. We will provide an update by Wednesday, 2023-04-26 00:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other zones. |
| 25 Apr 2023 | 22:18 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: Water intrusion in europe-west9-a led to an emergency shutdown of some hardware in that zone. There is no current ETA for recovery of operations in europe-west9-a, but it is expected to be an extended outage. Customers are advised to fail over to other zones in europe-west9 if they are impacted. We will provide an update by Wednesday, 2023-04-26 00:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other zones within europe-west9. |
| 25 Apr 2023 | 22:16 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: Water intrusion in europe-west9-a led to an emergency shutdown of some hardware in that zone. There is no current ETA for recovery of operations in europe-west9-a, but it is expected to be an extended outage. Customers are advised to fail over to other zones in europe-west9 if they are impacted. We will provide an update by Wednesday, 2023-04-26 00:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other available zones. |
| 25 Apr 2023 | 20:51 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: Water intrusion in europe-west9-a led to an emergency shutdown of some hardware in that zone. There is no current ETA for recovery of operations in europe-west9-a, but it is expected to be an extended outage. Customers are advised to fail over to other zones in europe-west9 if they are impacted. We will provide an update by Tuesday, 2023-04-25 22:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other zones within europe-west9. |
| 25 Apr 2023 | 19:56 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Our engineering team continues to investigate the issue. We will provide an update by Tuesday, 2023-04-25 21:00 US/Pacific with current details. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other zones within europe-west9. |
| 25 Apr 2023 | 19:25 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Our engineering team continues to investigate the issue. We will provide an update by Tuesday, 2023-04-25 20:00 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other zones within europe-west9. |
| 25 Apr 2023 | 19:00 PDT | Summary: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Description: We are investigating an issue affecting multiple Cloud services in the europe-west9-a zone. Our engineering team continues to investigate the issue. We will provide an update by Tuesday, 2023-04-25 19:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a. Workaround: Customers can fail over to other zones within europe-west9. |
- All times are US/Pacific