Service Health
Incident affecting Hybrid Connectivity
Google Cloud Interconnect in Montreal, Canada is experiencing availability issues.
Incident began at 2023-08-14 16:11 and ended at 2023-08-15 11:22 (all times are US/Pacific).
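The outage durations quoted on this page follow directly from these timestamps. A minimal sketch of the arithmetic is below. It assumes all timestamps are US/Pacific with no daylight-saving change inside the window, so naive datetimes are sufficient. The helper name `outage_duration` is illustrative only and is not part of any Google tooling.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M"


def outage_duration(start: str, end: str) -> tuple[int, int]:
    """Return the elapsed time between two timestamps as (hours, minutes).

    Both timestamps are assumed to be in the same time zone (US/Pacific on
    this page), with no daylight-saving transition between them.
    """
    started = datetime.strptime(start, TIME_FORMAT)
    ended = datetime.strptime(end, TIME_FORMAT)
    total_minutes = int((ended - started).total_seconds() // 60)
    return divmod(total_minutes, 60)


# Overall incident window, which is also the window for yul-zone2-1944.
print(outage_duration("2023-08-14 16:11", "2023-08-15 11:22"))
# Window for yul-zone1-1944, which was restored earlier.
print(outage_duration("2023-08-14 16:11", "2023-08-15 01:06"))
```

The two calls correspond to the 19 hours, 11 minutes and 8 hours, 55 minutes figures reported for the two Edge Availability Domains.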
Previously affected location(s)
Montréal (northamerica-northeast1)
| Date | Time | Description |
|---|---|---|
| 17 Aug 2023 | 13:40 PDT | Incident Report. Summary: On Monday, 14 August 2023, starting at 16:11 US/Pacific, Google Cloud Interconnect experienced an outage involving underground fiber cables in Montreal, Canada. Customers in the impacted location experienced availability issues and were unable to access Google Cloud services via physical connections for a period of 19 hours and 11 minutes. To our customers whose businesses were impacted during this outage, we sincerely apologize. This is not the level of quality and reliability we strive to offer, and we are taking immediate steps to improve the platform’s performance and availability. Root Cause: The outage occurred due to a fire in an underground, third-party-owned cable chamber that affected a large leased fiber cable below street level, impacting the availability of the only physical path to the Edge Availability Domains (EADs) yul-zone1-1944 and yul-zone2-1944. The fire affected a fiber conduit that Google uses to provide inter-zone connectivity. At the time of the fire, this was the only physical path providing interconnect to customers from these EADs. This is not the expected redundancy. The Edge Availability Domains yul-zone1-1944 and yul-zone2-1944 are planned and built for full redundancy of fiber connections back to the Google backbone network, with no single points of failure in equipment, and fiber paths are verified to be physically diverse along the whole route between Google network locations. The EADs were operating without a usable redundant path due to previous fiber cuts that remained unrepaired. The repairs were not completed because documentation in our systems had been updated prematurely to reflect our plans for future migrations, which led us to believe the inactive fiber status was intentional. Other than the impact to Cloud Interconnect, the northamerica-northeast1 (YUL) region was unaffected.
Remediation and Prevention: Google engineers were alerted to the outage via our internal systems on 14 August at 16:11 US/Pacific and immediately started an investigation. Google engineering teams worked with the third-party fiber vendor and restored service for Cloud Interconnect location yul-zone1-1944 on Tuesday, 15 August 2023 at 01:06 US/Pacific. Service for yul-zone2-1944 was restored on Tuesday, 15 August at 11:22 US/Pacific. The fire-damaged fiber could not be repaired in place, so the damaged underground cable chamber had to be bypassed physically. This required splicing several hundred fiber links, including the ones leased by Google, which extended the repair timeline. Simultaneously, Google attempted to return to service the redundant physical path not affected by the fire. However, this was not immediately successful, and the incident was resolved by the repair of the primary path. Google restored full redundancy for both EADs (both primary and redundant paths) on 16 August at 14:55 US/Pacific. Google Cloud Interconnect offers a 99.99% reliability SLA for customers who deploy at least two connections in each of two interconnect regions. Customers who had deployed this architecture were not affected by this outage. For more information on this feature, see the following: Topology for production-level applications overview. Additionally, Google is committed to preventing future recurrence by taking the following actions:
Detailed Description of Impact
|
| 17 Aug 2023 | 08:36 PDT | Google restored full redundancy for both Edge Availability Domains (both primary and redundant paths) on 16 August at 14:55 US/Pacific. We will complete a full IR in the following days that will provide additional details. |
| 15 Aug 2023 | 15:42 PDT | Mini Incident Report. We apologize for the inconvenience this service outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support. (All times US/Pacific.) yul-zone1-1944: Incident Start: 14 August 2023, 16:11; Incident End: 15 August 2023, 01:06; Duration: 8 hours, 55 minutes. yul-zone2-1944: Incident Start: 14 August 2023, 16:11; Incident End: 15 August 2023, 11:22; Duration: 19 hours, 11 minutes. Affected Services and Features: Google Cloud Interconnect. Regions/Zones: Montreal, Canada Interconnect locations. Description: Google Cloud Interconnect experienced an outage involving underground fiber cables in Montreal, Canada. The customer impact during the outage was due to two unrelated issues: (1) The Edge Availability Domains yul-zone1-1944 and yul-zone2-1944 were operating without a usable redundant path due to a previous fiber cut that remained unrepaired. The repairs were not completed because documentation in our systems had been updated prematurely to reflect our plans for future migrations. (2) The outage occurred when a fire broke out in an underground, third-party-owned cable chamber, which affected a large leased fiber cable below street level, impacting the availability of the only remaining physical path to these Edge Availability Domains. Google uses the fire-affected fiber cable to provide inter-zone connectivity. At the time of the fire, this was the only physical path between the affected routers providing interconnect to customers.
Google engineering teams worked with the third-party fiber vendor and restored service for Cloud Interconnect location yul-zone1-1944 on Tuesday, 2023-08-15 at 01:06 US/Pacific and restored service for yul-zone2-1944 on Tuesday, 2023-08-15 at 11:22 US/Pacific. The fire-damaged fiber could not be repaired in place, so the damaged underground cable chamber had to be bypassed physically. This required splicing several hundred fiber links, including the ones leased by Google, which extended the repair timeline. Simultaneously, Google attempted to return to service the redundant physical path not affected by the fire. However, this did not succeed before the splicing efforts were completed, which resolved the outage. Google is continuing to work with third-party vendors to establish all redundant physical paths. Google will complete a full IR in the following days that will provide additional details. Customer Impact:
|
| 15 Aug 2023 | 12:32 PDT | The issue with Google Cloud Interconnect in Montreal, Canada has been resolved for all affected users as of Tuesday, 2023-08-15 11:22 US/Pacific. We will publish an analysis of this incident once we have completed our internal investigation. We thank you for your patience while we worked on resolving the issue. |
| 15 Aug 2023 | 11:46 PDT | Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues. Description: Google Cloud Interconnect locations "yul-zone1-1944" and "yul-zone2-1944" in Montreal, Canada are now fully restored. Internal monitoring shows traffic on the physical connections. Our engineering team is continuing to closely monitor the traffic. We will provide an update by Tuesday, 2023-08-15 12:30 US/Pacific. Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and may be unable to access GCP services via physical connections in that location. Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada. Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations. |
| 15 Aug 2023 | 09:57 PDT | Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues. Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected. Service restoration for interconnect location yul-zone1-1944 was completed on Tuesday, 2023-08-15 01:06 US/Pacific. Our engineering team is continuing to work with the hardware vendor for service restoration in yul-zone2-1944. We do not have an ETA for complete mitigation at this point. We will provide an update by Tuesday, 2023-08-15 12:00 US/Pacific. Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and may be unable to access GCP services via physical connections in that location. Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada. Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations. |
| 14 Aug 2023 | 22:27 PDT | Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues. Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected. Mitigation work is underway by our engineering team. We will provide an update by Tuesday, 2023-08-15 10:00 US/Pacific. Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and may be unable to access GCP services via physical connections in that location. Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada. Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations. |
| 14 Aug 2023 | 21:29 PDT | Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues. Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected. Our engineering team continues to investigate the issue. We will provide an update by Monday, 2023-08-14 23:00 US/Pacific. Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and may be unable to access GCP services via physical connections in that location. Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada. Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations. |
| 14 Aug 2023 | 20:33 PDT | Summary: Google Cloud Interconnect in Montreal, Canada is experiencing availability issues. Description: We are experiencing an issue with physical connections in Google Cloud Interconnect in Montreal, Canada. Multiple Edge Availability Domains are affected. Our engineering team continues to investigate the issue. We will provide an update by Monday, 2023-08-14 21:45 US/Pacific. Diagnosis: Customers using Cloud Interconnect in Montreal, Canada may experience availability issues and may be unable to access GCP services via physical connections in that location. Other GCP services in northamerica-northeast1 are not affected. The region should be accessible from interconnects outside of Montreal, Canada. Workaround: Customers with redundant interconnects in multiple peering locations can use global dynamic routing to fail over their Cloud Interconnect traffic to those locations. |
| 14 Aug 2023 | 19:04 PDT | Summary: Google Cloud Networking is experiencing interconnect issues. Description: We are experiencing an issue with Google Cloud Networking. Our engineering team continues to investigate the issue. We will provide an update by Monday, 2023-08-14 21:30 US/Pacific with current details. Diagnosis: Some Google Cloud Interconnect customers in the Montréal, Canada area connecting to northamerica-northeast1 may observe link down on their attachments. Workaround: None at this time. |
| 14 Aug 2023 | 18:43 PDT | Summary: Google Cloud Networking is experiencing interconnect issues. Description: We are experiencing an issue with Google Cloud Networking. Our engineering team continues to investigate the issue. We will provide an update by Monday, 2023-08-14 21:30 US/Pacific with current details. Diagnosis: Some Google Cloud Interconnect customers in the Montréal, Canada area connecting to northamerica-northeast1 may observe link down on their attachments. Workaround: None at this time. |
| 14 Aug 2023 | 18:35 PDT | Summary: Google Cloud Networking is experiencing interconnect issues. Description: We are experiencing an issue with Google Cloud Networking. Our engineering team continues to investigate the issue. We will provide an update by Monday, 2023-08-14 19:00 US/Pacific with current details. Diagnosis: Some Google Cloud Interconnect customers in the Montréal, Canada area connecting to northamerica-northeast1 may observe link down on their attachments. Workaround: None at this time. |
- All times are US/Pacific
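
The incident report above states that customers with at least two connections in each of two interconnect regions were not affected. A minimal sketch of checking a connection inventory against that stated rule follows. It does not call any Google Cloud API. The `Connection` type, the function names, and the `yyz` entries (including their Edge Availability Domain labels) are hypothetical placeholders. The extra requirement that the two connections in a metro sit in different Edge Availability Domains is an assumption added for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    name: str   # hypothetical label for a physical connection
    metro: str  # interconnect metro, e.g. "yul" for Montreal
    ead: str    # Edge Availability Domain, e.g. "yul-zone1-1944"


def redundant_metros(connections: list[Connection]) -> list[str]:
    """Return metros with connections in at least two distinct EADs (assumed rule)."""
    eads_by_metro: dict[str, set[str]] = defaultdict(set)
    for conn in connections:
        eads_by_metro[conn.metro].add(conn.ead)
    return sorted(m for m, eads in eads_by_metro.items() if len(eads) >= 2)


def meets_stated_topology(connections: list[Connection]) -> bool:
    """True if at least two metros each have two connections in distinct EADs."""
    return len(redundant_metros(connections)) >= 2


def survives_metro_outage(connections: list[Connection], failed_metro: str) -> bool:
    """True if at least one connection remains outside the failed metro."""
    return any(conn.metro != failed_metro for conn in connections)


montreal_only = [
    Connection("a", "yul", "yul-zone1-1944"),
    Connection("b", "yul", "yul-zone2-1944"),
]
# The "yyz" entries are made-up placeholders for a second metro.
two_metros = montreal_only + [
    Connection("c", "yyz", "yyz-zone1-example"),
    Connection("d", "yyz", "yyz-zone2-example"),
]

print(meets_stated_topology(montreal_only), survives_metro_outage(montreal_only, "yul"))
print(meets_stated_topology(two_metros), survives_metro_outage(two_metros, "yul"))
```

In this sketch, an inventory confined to the two Montreal Edge Availability Domains fails both checks, which mirrors the situation in which both domains shared a single remaining physical path.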