
Cancelling jobs without data loss on Dataflow

I'm trying to find a way to gracefully end my jobs, which stream from PubSub and write to BigQuery, so as not to lose any data.

One approach I can envision is to have the job stop pulling new data and then keep running until it has processed everything already in flight, but I don't know whether or how this can be implemented.

MffnMn asked Feb 05 '16 11:02



2 Answers

It appears this feature was added in the latest release.

All you have to do now is select the drain option when cancelling a job.
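For anyone who would rather script this than click through the console, here is a minimal sketch. It assumes the gcloud CLI is installed and authenticated, and it shells out to `gcloud dataflow jobs drain`. The helper names (`build_drain_command`, `drain_job`) and the job ID and region in the example are placeholders of mine, not anything from the Dataflow docs.

```python
import subprocess
from typing import List, Optional


def build_drain_command(job_id: str, region: str,
                        project: Optional[str] = None) -> List[str]:
    """Build the gcloud argument list that asks Dataflow to drain a job.

    Draining stops the job from reading new input and lets it finish
    processing what it has already pulled, instead of dropping it.
    """
    cmd = ["gcloud", "dataflow", "jobs", "drain", job_id, f"--region={region}"]
    if project:
        # --project is the standard gcloud flag for overriding the default project.
        cmd.append(f"--project={project}")
    return cmd


def drain_job(job_id: str, region: str, project: Optional[str] = None) -> None:
    """Run the drain command; raises CalledProcessError if gcloud fails."""
    subprocess.run(build_drain_command(job_id, region, project), check=True)


if __name__ == "__main__":
    # Placeholder values: substitute your own job ID and region.
    print(" ".join(build_drain_command("my-job-id", "us-central1")))
```

Calling `drain_job(...)` only submits the request; the job keeps running until its in-flight data has been written out, so check the job state afterwards rather than assuming it has stopped.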

Thanks.

MffnMn answered Oct 20 '22 06:10



I believe this would be difficult (if not impossible) to do on your own. We (Google Cloud Dataflow team) are aware of this need and are working on addressing it with a new feature in the coming months.

Eric Anderson answered Oct 20 '22 05:10
