Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Vertex AI Training, Cloud Machine Learning

Vertex AI training jobs are experiencing issues where jobs may take longer than usual 

Incident began at 2023-07-22 18:30 and ended at 2023-07-23 02:18 (all times are US/Pacific).

Previously affected location(s)

Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Seoul (asia-northeast3), Mumbai (asia-south1), Singapore (asia-southeast1), Sydney (australia-southeast1), Belgium (europe-west1), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Oregon (us-west1), Los Angeles (us-west2)

Date Time Description
23 Jul 2023 02:18 PDT

The issue with Vertex AI Training has been resolved for all affected users as of Sunday, 2023-07-23 02:09 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

22 Jul 2023 20:02 PDT

Summary: Vertex AI training jobs are experiencing issues where jobs may take longer than usual 

Description: Mitigation work is currently underway by our engineering team.

The mitigation is expected to complete by Sunday, 2023-07-23 02:00 US/Pacific.

We will provide more information by Sunday, 2023-07-23 02:30 US/Pacific.

Diagnosis: Vertex AI training jobs are experiencing issues where jobs may take longer than usual. Affected customers may see an increase in the error "Container <container_name> was reset due to preemption".

Workaround: None at this time.

22 Jul 2023 19:14 PDT

Summary: Vertex AI training jobs are experiencing issues where jobs may take longer than usual 

Description: We are experiencing an issue with Vertex AI Training. 

Our engineering team has investigated the issue and identified a mitigation, and is working to start the mitigation process.

We will provide an update by Saturday, 2023-07-22 20:30 US/Pacific with current details. 

We apologize to all who are affected by the disruption.

Diagnosis: Vertex AI training jobs are experiencing issues where jobs may take longer than usual. Affected customers may see an increase in the error "Container <container_name> was reset due to preemption".

Workaround: None at this time.
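
To check whether your own jobs hit the error quoted in the diagnosis, you can search exported job logs for that message. The following is only a minimal sketch: the function name, the sample log lines, and the assumption that logs are available as plain text lines are illustrative and not part of the incident report.

```python
import re

# Pattern built from the error text quoted in the diagnosis. The container
# name is captured because its actual value varies from job to job.
PREEMPTION_RESET = re.compile(
    r"Container (?P<name>\S+) was reset due to preemption"
)


def find_preemption_resets(log_lines):
    """Return the container names found in preemption-reset messages."""
    names = []
    for line in log_lines:
        match = PREEMPTION_RESET.search(line)
        if match:
            names.append(match.group("name"))
    return names


# Hypothetical log lines, invented for illustration only.
sample = [
    "2023-07-22T18:41:07 Container trainer-0 was reset due to preemption",
    "2023-07-22T18:42:10 Epoch 3 finished",
    "2023-07-22T19:05:55 Container trainer-1 was reset due to preemption",
]
print(find_preemption_resets(sample))
```

Counting the matches per job over the incident window gives a rough indication of whether a slow job was affected by these resets.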