
Sending credentials to Google Dataflow jobs

What is the right way to pass credentials to Dataflow jobs?

Some of my Dataflow jobs need credentials to make REST calls and fetch/post processed data.

I am currently using environment variables to pass the credentials to the JVM, reading them into a Serializable object and passing them on to the DoFn implementation's constructor. I am not sure this is the right approach, since any class that is Serializable should not contain sensitive information.
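As a concrete illustration of the pattern described above, here is a minimal sketch in plain Java (without the Beam SDK on the classpath); the class and variable names, including `MY_API_KEY`, are hypothetical:

```java
import java.io.Serializable;
import java.util.Map;

// Hypothetical credentials holder. It is Serializable so Dataflow can ship
// it to workers inside the DoFn -- which is exactly the concern raised above:
// anything Serializable may end up written out during serialization.
class ApiCredentials implements Serializable {
    final String apiKey;

    // Populate from an environment-like map (in a real job: System.getenv()).
    ApiCredentials(Map<String, String> env) {
        this.apiKey = env.get("MY_API_KEY"); // assumed variable name
    }
}

public class Main {
    public static void main(String[] args) {
        // In a real pipeline: new ApiCredentials(System.getenv()),
        // then pass the holder to the DoFn's constructor.
        ApiCredentials creds = new ApiCredentials(Map.of("MY_API_KEY", "s3cret"));
        System.out.println(creds.apiKey);
    }
}
```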

Another way I thought of is to store the credentials in GCS and retrieve them using a service account key file, but I was wondering why my job should have to perform this extra step of reading credentials from GCS.

asked Mar 28 '18 by Krishna Chaitanya P

1 Answer

Google Cloud Dataflow does not have native support for passing or storing secured secrets. However, you can use Cloud KMS and/or GCS, as you propose, to read a secret at runtime using your Dataflow service account credentials.

If you read the credential at runtime from a DoFn, you can use the DoFn.Setup lifecycle API to read the value once and cache it for the lifetime of the DoFn.
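The read-once-and-cache pattern described above can be sketched as follows. This is plain Java so it runs without the Beam SDK; in a real pipeline `setup()` would carry Beam's `@Setup` annotation and `processElement(...)` the `@ProcessElement` annotation, and all names here are illustrative:

```java
import java.io.Serializable;
import java.util.function.Supplier;

// Sketch of a DoFn-like class that fetches a secret once per instance
// and caches it for the instance's lifetime.
class SecureCallFn implements Serializable {
    // transient: the fetched secret is never serialized with the DoFn.
    private transient String secret;
    private transient int fetchCount = 0;

    // Stand-in for the actual fetch (e.g. a GCS read or KMS decrypt).
    // In a real DoFn this field would also need to be serializable.
    private final Supplier<String> secretFetcher;

    SecureCallFn(Supplier<String> secretFetcher) {
        this.secretFetcher = secretFetcher;
    }

    // @Setup equivalent: invoked before elements are processed;
    // the null guard ensures the secret is fetched at most once.
    void setup() {
        if (secret == null) {
            secret = secretFetcher.get();
            fetchCount++;
        }
    }

    // @ProcessElement equivalent: every element reuses the cached secret.
    String processElement(String element) {
        return element + " [authed with " + secret + "]";
    }

    int getFetchCount() {
        return fetchCount;
    }
}
```

The point of caching in `setup()` rather than fetching per element is that a DoFn instance may process many elements, so the remote secret lookup happens once per worker instance instead of once per record.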

You can learn about various options for secret management in Google Cloud here: Secret management with Cloud KMS.

answered Nov 23 '22 by Scott Wegner