Service Health


Incident affecting Google Cloud Dataflow, Dataproc Metastore

Google Cloud Dataflow experienced elevated errors when starting new or querying existing Dataflow jobs in us-west1, asia-east1, asia-northeast1, and europe-west1.

Incident began at 2021-08-25 01:37 and ended at 2021-08-25 04:14 (all times are US/Pacific).

Date Time Description
25 Aug 2021 11:18 PDT

We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case using https://cloud.google.com/support

(All Times US/Pacific)

Incident Start: 25 August 2021 01:37

Incident End: 25 August 2021 04:14

Duration: 2 hours, 37 minutes
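The duration above follows directly from the stated start and end times. As a minimal sketch of that arithmetic (the function names `incident_duration` and `format_duration` are hypothetical and chosen for illustration only; both timestamps are assumed to be US/Pacific wall-clock times with no daylight-saving transition between them):

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M"


def incident_duration(start: str, end: str) -> timedelta:
    """Return the elapsed time between two wall-clock timestamps.

    Assumption: both timestamps are in the same time zone (US/Pacific here)
    and no daylight-saving change occurs between them, so naive datetime
    subtraction is sufficient.
    """
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    ended = datetime.strptime(end, TIMESTAMP_FORMAT)
    return ended - started


def format_duration(delta: timedelta) -> str:
    """Format a timedelta as 'H hours, M minutes'."""
    total_minutes = int(delta.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} hours, {minutes} minutes"


# Start and end times as reported for this incident.
duration = incident_duration("2021-08-25 01:37", "2021-08-25 04:14")
print(format_duration(duration))  # 2 hours, 37 minutes
```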

Affected Services and Features:

  • Google Cloud Dataflow
  • Dataproc Metastore

Regions/Zones: us-west1, asia-east1, asia-northeast1, europe-west1

Description:

Google Cloud Dataflow experienced elevated errors when starting new or querying existing Dataflow jobs in us-west1, asia-east1, asia-northeast1, and europe-west1 for a duration of 2 hours and 37 minutes. From preliminary analysis, the root cause of the issue was a misconfiguration triggered by a rollout.

Customer Impact:

  • 500 errors when launching new Dataflow jobs.
  • 500 errors when querying existing Dataflow jobs.
  • The majority of the impact for customers was in us-west1, with 33% of job creation and query traffic reporting errors.
  • Dataproc Metastore uses underlying Dataflow jobs for some features and therefore experienced elevated errors of up to 100% on the following APIs in us-west1 from 01:27 to 03:53: Restore, Import (from SQL, Avro), and Export (to Avro).
  • There was a slight recurrence in europe-west1 between 06:06 and 08:02, with a peak error rate of 3.5%.
  • Existing jobs continued to progress without issue.

25 Aug 2021 04:43 PDT

The issue with Google Cloud Dataflow has been resolved for all affected projects as of Wednesday, 2021-08-25 04:42 US/Pacific.

We thank you for your patience while we worked on resolving the issue.

25 Aug 2021 04:30 PDT

Summary: Dataflow job querying and creation impacted in us-west1

Description: Mitigation work is currently underway by our engineering team.

The mitigation is expected to complete by Wednesday, 2021-08-25 06:00 US/Pacific.

We will provide more information by Wednesday, 2021-08-25 05:30 US/Pacific.

Diagnosis: We believe that starting a Dataflow job or querying existing Dataflow jobs may fail for customers. However, existing Dataflow jobs should continue to progress as usual.

Workaround: None at this time.

25 Aug 2021 03:39 PDT

Summary: Dataflow job querying and creation impacted in us-west1

Description: We are experiencing an issue with Google Cloud Dataflow in us-west1 beginning at Wednesday, 2021-08-25 01:37 US/Pacific.

Our engineering team continues to investigate the issue.

We will provide an update by Wednesday, 2021-08-25 04:30 US/Pacific with current details. We apologize to all who are affected by the disruption.

Diagnosis: We believe that starting a Dataflow job or querying existing Dataflow jobs may fail for customers. However, existing Dataflow jobs should continue to progress as usual.

Workaround: None at this time.