Google Cloud Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

For incidents related to Google Security Products, visit https://status.cloud.google.com/security. For incidents related to Looker (original), visit https://status.cloud.google.com/looker.

Incident affecting Operations, Cloud Monitoring

Elevated error rate across all monitoring API endpoints globally

Incident began at 2021-08-06 14:25 and ended at 2021-08-06 20:53 (all times are US/Pacific).

Date	Time	Description
9 Aug 2021	09:02 PDT	We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case using https://cloud.google.com/support
		(All Times US/Pacific)
		Incident Start: 06 August 2021 14:25
		Incident End: 06 August 2021 20:53
		Duration: 6 hours, 28 minutes
		Affected Services and Features: Google Cloud Monitoring
		Regions/Zones: Global
		Description: Google Cloud Monitoring experienced increased latency and error rates for monitoring endpoints globally for 6 hours, 28 minutes. From preliminary analysis, the root cause of the issue is an overload of a monitoring API dependency that serves metric and monitored resource descriptors.
		Customer Impact: Requests against the monitoring API would have seen increased timeouts, errors, and latency. Cloud Monitoring dashboards would have failed to load due to timeout.
		Additional details: This service disruption was mitigated by increasing the resources available to the affected dependency, and we are confident that there will not be a recurrence.
6 Aug 2021	21:32 PDT	The issue with Cloud Monitoring has been resolved for all affected users as of Friday, 2021-08-06 20:53 US/Pacific. We thank you for your patience while we worked on resolving the issue.
6 Aug 2021	17:51 PDT	Summary: Elevated error rate across all monitoring API endpoints globally Description: Mitigation work is still underway by our engineering team. We do not have an ETA for mitigation at this point. We will provide more information by Friday, 2021-08-06 21:29 US/Pacific. Diagnosis: None at this time. Workaround: None at this time.
6 Aug 2021	16:56 PDT	Summary: Elevated error rate across all monitoring API endpoints globally Description: Mitigation work is still underway by our engineering team. We do not have an ETA for mitigation at this point. We will provide more information by Friday, 2021-08-06 17:59 US/Pacific. Diagnosis: None at this time. Workaround: None at this time.
6 Aug 2021	15:58 PDT	Summary: Elevated error rate across all monitoring API endpoints globally Description: Mitigation work is currently underway by our engineering team. We do not have an ETA for mitigation at this point. We will provide more information by Friday, 2021-08-06 16:59 US/Pacific. Diagnosis: None at this time. Workaround: None at this time.
6 Aug 2021	15:25 PDT	Summary: Elevated error rate across all monitoring API endpoints globally Description: Mitigation work is currently underway by our engineering team. We do not have an ETA for mitigation at this point. We will provide more information by Friday, 2021-08-06 15:59 US/Pacific. Diagnosis: None at this time. Workaround: None at this time.
6 Aug 2021	15:11 PDT	Summary: Elevated error rate across all monitoring API endpoints globally Description: We are experiencing an issue with Cloud Monitoring beginning at Friday, 2021-08-06 14:25 US/Pacific. Our engineering team continues to investigate the issue. We will provide the next update by Friday, 2021-08-06 15:40 US/Pacific. Diagnosis: None at this time. Workaround: None at this time.