Service Health
Incident affecting Vertex AI Training, Cloud Machine Learning
Vertex AI training jobs are experiencing issues where jobs may take longer than usual
Incident began at 2023-07-22 18:30 and ended at 2023-07-23 02:18 (all times are US/Pacific).
Previously affected location(s)
Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Seoul (asia-northeast3), Mumbai (asia-south1), Singapore (asia-southeast1), Sydney (australia-southeast1), Belgium (europe-west1), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Oregon (us-west1), Los Angeles (us-west2)
| Date | Time | Description |
|---|---|---|
| 23 Jul 2023 | 02:18 PDT | The issue with Vertex AI Training has been resolved for all affected users as of Sunday, 2023-07-23 02:09 US/Pacific. We thank you for your patience while we worked on resolving the issue. |
| 22 Jul 2023 | 20:02 PDT | Summary: Vertex AI training jobs are experiencing issues where jobs may take longer than usual. Description: Mitigation work is currently underway by our engineering team. The mitigation is expected to complete by Sunday, 2023-07-23 02:00 US/Pacific. We will provide more information by Sunday, 2023-07-23 02:30 US/Pacific. Diagnosis: Vertex AI training jobs are experiencing issues where jobs may take longer than usual. Affected customers may see an increase in "Container <container_name> was reset due to preemption" errors. Workaround: None at this time. |
| 22 Jul 2023 | 19:14 PDT | Summary: Vertex AI training jobs are experiencing issues where jobs may take longer than usual. Description: We are experiencing an issue with Vertex AI Training. Our engineering team has investigated the issue, identified a mitigation, and is working to start the mitigation process. We will provide an update by Saturday, 2023-07-22 20:30 US/Pacific with current details. We apologize to all who are affected by the disruption. Diagnosis: Vertex AI training jobs are experiencing issues where jobs may take longer than usual. Affected customers may see an increase in "Container <container_name> was reset due to preemption" errors. Workaround: None at this time. |
- All times are US/Pacific