Service Health


Incident affecting Hybrid Connectivity

Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.

Incident began at 2023-08-14 16:11 and ended at 2023-08-15 11:22 (all times are US/Pacific).

Previously affected location(s)

Montréal (northamerica-northeast1)

Date Time Description
17 Aug 2023 13:40 PDT

Incident Report

Summary

On Monday, 14 August 2023 starting at 16:11 US/Pacific, Google Cloud Interconnect experienced an outage with underground fiber cables in Montreal, Canada. Customers in the impacted location experienced availability issues and were unable to access Google Cloud services via physical connections for a period of 19 hours and 11 minutes. To our customers whose businesses were impacted during this outage, we sincerely apologize. This is not the level of quality and reliability we strive to offer, and we are taking immediate steps to improve the platform’s performance and availability.

Root Cause

The outage occurred due to a fire in an underground, third-party-owned cable chamber that affected a large leased fiber cable below street level, impacting the availability of the only physical path to the Edge Availability Domains (EADs) yul-zone1-1944 and yul-zone2-1944. Google uses the affected fiber conduit to provide inter-zone connectivity.

At the time of the fire, this was the only physical path providing interconnect to customers from these EADs. This is not the expected redundancy.

The Edge Availability Domains, yul-zone1-1944 and yul-zone2-1944, are planned and built for full redundancy of fiber connections back to the Google backbone network, with no single points of failure in equipment, and fiber paths are verified to be physically diverse along the whole route between Google network locations.

The EADs were operating without a usable redundant path because previous fiber cuts remained unrepaired. The repairs were not completed because documentation in our systems had been updated prematurely to reflect our plans for future migrations, which led us to believe the inactive fiber status was intentional.

Other than the impact to Cloud Interconnect, the northamerica-northeast1 (YUL) region was unaffected.

Remediation and Prevention

Google engineers were alerted to the outage via our internal systems on 14 August, 16:11 US/Pacific and immediately started an investigation.

Google engineering teams worked with the third-party fiber vendor and restored service for Cloud Interconnect location yul-zone1-1944 on Tuesday, 15 August 2023 at 01:06 US/Pacific. Service for yul-zone2-1944 was restored on Tuesday, 15 August 2023 at 11:22 US/Pacific.

The fire-damaged fiber could not be repaired in place, so the damaged underground cable chamber had to be bypassed physically. This required splicing several hundred fiber links, including the ones leased by Google, which extended the repair timeline.

Simultaneously, Google attempted to return to service the redundant physical path not affected by the fire. However, this was not immediately successful, and the incident was resolved by the repair of the primary path.

Google restored full redundancy for both EADs (both primary and redundant paths) on 16 August, at 14:55 US/Pacific.

Google Cloud Interconnect offers a 99.99% reliability SLA for customers who deploy at least two connections in each of two interconnect regions. Customers that had deployed this architecture were not affected by this outage. For more information on this architecture, see "Topology for production-level applications overview".
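The deployment rule stated above (at least two connections in each of two interconnect regions) can be checked mechanically. The following is a minimal sketch only: the `Connection` record, the function name, and the region labels are hypothetical and are not part of any Google API. It also encodes only the rule as worded in this report, so the linked topology documentation may list further requirements.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One physical Interconnect connection (hypothetical record, not a Google API type)."""
    name: str
    region: str  # illustrative interconnect region label


def meets_redundancy_rule(connections, min_regions=2, min_per_region=2):
    """Return True if at least `min_regions` regions each hold at least
    `min_per_region` connections, as the rule is worded in this report."""
    per_region = Counter(c.region for c in connections)
    qualifying = [region for region, count in per_region.items() if count >= min_per_region]
    return len(qualifying) >= min_regions


# Illustrative inventories (assumed data, not taken from the incident).
montreal_only = [Connection("ic-1", "montreal"), Connection("ic-2", "montreal")]
two_regions = montreal_only + [Connection("ic-3", "toronto"), Connection("ic-4", "toronto")]

print(meets_redundancy_rule(montreal_only))  # False
print(meets_redundancy_rule(two_regions))    # True
```

Under this sketch, a deployment confined to Montreal fails the check, which matches the report's statement that only customers with the multi-region architecture were unaffected.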

Additionally, Google is committed to preventing future recurrence by taking the following actions:

  • Improve rules within Google's internal repair prioritization system so that any issue that results in a loss of redundancy for more than one EAD in a region is treated at the highest repair priority, where Google's target is a 24-hour return to service.
  • Ensure that exceptions that occur between the migration and repair processes for fibers have a documented escalation process, and that migration processes do not change fiber documentation until immediately after the migration is complete.
  • Complete the migration (both in Montreal and elsewhere) that moves all Interconnect zones to fully active-active connections. This builds upon our existing standard of "no single failure can affect both EADs in a region" to a stronger commitment of "no single failure can affect any EAD".
  • Ensure that clear notifications for isolation outages are sent proactively to customers.
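The first action item above describes a prioritization rule that can be illustrated in a few lines. This is a sketch under stated assumptions, not Google's internal system: the function name, the priority labels, and the idea that "loss of redundancy" means fewer than two usable fiber paths are all assumptions made for illustration.

```python
HIGHEST = "highest"    # the report's target for this tier: 24-hour return to service
STANDARD = "standard"  # hypothetical label for everything else


def repair_priority(usable_paths_by_ead, required_paths=2):
    """Classify one region's repair priority from usable fiber paths per EAD.

    An EAD is treated as having lost redundancy when it has fewer than
    `required_paths` usable paths (an assumption). If more than one EAD in
    the region is in that state, the issue gets the highest priority.
    """
    degraded = [ead for ead, paths in usable_paths_by_ead.items() if paths < required_paths]
    return HIGHEST if len(degraded) > 1 else STANDARD


# State described in the Root Cause section before the fire:
# both EADs were running on a single usable path.
before_fire = {"yul-zone1-1944": 1, "yul-zone2-1944": 1}
print(repair_priority(before_fire))  # highest

# Only one EAD degraded: does not meet the "more than one EAD" condition.
print(repair_priority({"yul-zone1-1944": 1, "yul-zone2-1944": 2}))  # standard
```

Applied to the pre-fire state described in this report, such a rule would have flagged the unrepaired fiber cuts for the highest repair priority.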

Detailed Description of Impact

  • Starting on 14 August, 16:11 US/Pacific, 100% of customers using Cloud Interconnect in Montreal, Canada experienced unavailability, with customers unable to access Google Cloud services via physical connections in that location.
  • The interconnect location “yul-zone1-1944” was restored 10 hours, 16 minutes before yul-zone2-1944, allowing customers with redundant connectivity to access services via yul-zone1-1944 until yul-zone2-1944 was restored.
  • Other Google Cloud services in northamerica-northeast1 were accessible from interconnects outside of Montreal, Canada.

17 Aug 2023 08:36 PDT

Google restored full redundancy for both Edge Availability Domains (both primary and redundant paths) on 16 August, at 14:55 US/Pacific.

We will complete a full IR in the following days that will provide additional details.


15 Aug 2023 15:42 PDT

Mini Incident Report

We apologize for the inconvenience this service outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support.

(All Times US/Pacific)

yul-zone1-1944 Incident Start: 14 August 2023, 16:11

yul-zone1-1944 Incident End: 15 August 2023, 01:06

Duration: 8 hours, 55 minutes

yul-zone2-1944 Incident Start: 14 August 2023, 16:11

yul-zone2-1944 Incident End: 15 August 2023, 11:22

Duration: 19 hours, 11 minutes
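The durations above follow directly from the listed start and end times. As a quick check, the sketch below recomputes them with Python's standard library. It uses naive datetimes, which assumes all times share one US/Pacific offset (no daylight saving change falls within this window); the helper name `elapsed` is introduced here for illustration.

```python
from datetime import datetime

FMT = "%d %B %Y, %H:%M"  # matches timestamps such as "14 August 2023, 16:11"


def elapsed(start, end):
    """Return (hours, minutes) between two timestamps written in FMT."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    hours, remainder = divmod(int(delta.total_seconds()), 3600)
    return hours, remainder // 60


incident_start = "14 August 2023, 16:11"
zone1_end = "15 August 2023, 01:06"
zone2_end = "15 August 2023, 11:22"

print(elapsed(incident_start, zone1_end))  # (8, 55)
print(elapsed(incident_start, zone2_end))  # (19, 11)
print(elapsed(zone1_end, zone2_end))       # (10, 16)
```

The last value is the 10 hours, 16 minutes by which yul-zone1-1944 was restored ahead of yul-zone2-1944.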

Affected Services and Features:

Google Cloud Interconnect.

Regions/Zones: Montreal, Canada Interconnect locations.

Description:

Google Cloud Interconnect experienced an outage with underground fiber cables in Montreal, Canada. The customer impact during the outage was due to two unrelated issues:

  • The Edge Availability Domains yul-zone1-1944 and yul-zone2-1944 were operating without a usable redundant path because a previous fiber cut remained unrepaired. The repairs were not completed because the documentation in our systems had been updated prematurely to reflect our plans for future migrations.

  • The outage occurred when a fire broke out in an underground, third-party-owned cable chamber and affected a large leased fiber cable below street level, impacting the availability of the only remaining physical path to these Edge Availability Domains. Google uses this fiber cable to provide inter-zone connectivity. At the time of the fire, this was the only physical path between the affected routers providing interconnect to customers.

Google engineering teams worked with the third-party fiber vendor and restored service for Cloud Interconnect location yul-zone1-1944 on Tuesday, 2023-08-15 at 01:06 US/Pacific and restored service for yul-zone2-1944 on Tuesday, 2023-08-15 at 11:22 US/Pacific.

The fire-damaged fiber could not be repaired in place, so the damaged underground cable chamber had to be bypassed physically. This required splicing several hundred fiber links, including the ones leased by Google, which extended the repair timeline.

Simultaneously, Google attempted to return to service the redundant physical path not affected by the fire. However, this had not succeeded by the time the splicing efforts were completed, which resolved the outage. Google is continuing to work with third-party vendors to establish all redundant physical paths.

Google will complete a full IR in the following days that will provide additional details.

Customer Impact:

  • Customers using Cloud Interconnect in Montreal, Canada could have experienced availability issues, with customers unable to access Google Cloud services via physical connections in that location. The interconnect location “yul-zone1-1944” was restored 10 hours, 16 minutes before yul-zone2-1944, allowing customers with redundant connectivity to access services via yul-zone1-1944 until yul-zone2-1944 was restored.

  • Google Cloud services in northamerica-northeast1 were accessible from interconnects outside of Montreal, Canada.


15 Aug 2023 12:32 PDT

The issue with Google Cloud Interconnect in Montreal, Canada has been resolved for all affected users as of Tuesday, 2023-08-15 11:22 US/Pacific.

We will publish an analysis of this incident once we have completed our internal investigation.

We thank you for your patience while we worked on resolving the issue.

15 Aug 2023 11:46 PDT

Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.

Description: Google Cloud Interconnect locations "yul-zone1-1944" and "yul-zone2-1944" in Montreal, Canada are now fully restored. Internal monitoring shows traffic on the physical connections.

Our Engineering team is continuing to closely monitor the traffic.

We will provide an update by Tuesday, 2023-08-15 12:30 US/Pacific.

Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and be unable to access GCP services via physical connections in that location.

Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada.

Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations.

15 Aug 2023 09:57 PDT

Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.

Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected.

Service restoration for interconnect location yul-zone1-1944 was completed on Tuesday, 2023-08-15 01:06 US/Pacific.

Our engineering team is continuing to work with the hardware vendor for service restoration in yul-zone2-1944.

We do not have an ETA for complete mitigation at this point.

We will provide an update by Tuesday, 2023-08-15 12:00 US/Pacific.

Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and be unable to access GCP services via physical connections in that location.

Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada.

Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations.

14 Aug 2023 22:27 PDT

Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.

Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected.

Mitigation work is underway by our engineering team.

We will provide an update by Tuesday, 2023-08-15 10:00 US/Pacific.

Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and be unable to access GCP services via physical connections in that location.

Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada.

Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations.

14 Aug 2023 21:29 PDT

Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.

Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected.

Our engineering team continues to investigate the issue.

We will provide an update by Monday, 2023-08-14 23:00 US/Pacific.

Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and be unable to access GCP services via physical connections in that location.

Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada.

Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations.

14 Aug 2023 20:33 PDT

Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.

Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected.

Our engineering team continues to investigate the issue.

We will provide an update by Monday, 2023-08-14 21:45 US/Pacific.

Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and be unable to access GCP services via physical connections in that location.

Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada.

Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations.

14 Aug 2023 19:04 PDT

Summary: Google Cloud Networking is experiencing interconnect issues.

Description: We are experiencing an issue with Google Cloud Networking.

Our engineering team continues to investigate the issue.

We will provide an update by Monday, 2023-08-14 21:30 US/Pacific with current details.

Diagnosis: Some Google Cloud Interconnect customers in the Montréal, Canada area connecting to northamerica-northeast1 may observe link down on their attachments.

Workaround: None at this time.

14 Aug 2023 18:43 PDT

Summary: Google Cloud Networking is experiencing interconnect issues.

Description: We are experiencing an issue with Google Cloud Networking.

Our engineering team continues to investigate the issue.

We will provide an update by Monday, 2023-08-14 21:30 US/Pacific with current details.

Diagnosis: Some Google Cloud Interconnect customers in the Montréal, Canada area connecting to northamerica-northeast1 may observe link down on their attachments.

Workaround: None at this time.

14 Aug 2023 18:35 PDT

Summary: Google Cloud Networking is experiencing interconnect issues.

Description: We are experiencing an issue with Google Cloud Networking.

Our engineering team continues to investigate the issue.

We will provide an update by Monday, 2023-08-14 19:00 US/Pacific with current details.

Diagnosis: Some Google Cloud Interconnect customers in the Montréal, Canada area connecting to northamerica-northeast1 may observe link down on their attachments.

Workaround: None at this time.