Dataflow setting Controller Service Account

I am trying to set up a controller service account for Dataflow. In my Dataflow pipeline options I have:

options.setGcpCredential(
    GoogleCredentials.fromStream(new FileInputStream("key.json"))
        .createScoped(someArrays));
options.setServiceAccount("[email protected]");

But I'm getting:

WARNING: Request failed with code 403, performed 0 retries due to IOExceptions,         
         performed 0 retries due to unsuccessful status codes, HTTP framework says 
         request can be retried, (caller responsible for retrying): 
         https://dataflow.googleapis.com/v1b3/projects/MYPROJECT/locations/MYLOCATION/jobs
Exception in thread "main" java.lang.RuntimeException: Failed to create a workflow 
         job: (CODE): Current user cannot act as 
         service account "[email protected]. 
         Causes: (CODE): Current user cannot act as 
         service account "[email protected].
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:791)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:173)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)

...

Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
{
  "code" : 403,
  "errors" : [ {
    "domain" : "global",
    "message" : "(CODE): Current user cannot act as service account 
                 [email protected]. Causes: (CODE): Current user
                 cannot act as service account [email protected].",
    "reason" : "forbidden"
  } ],
  "message" : "(CODE): Current user cannot act as service account 
               [email protected]. Causes: (CODE): Current user 
               cannot act as service account [email protected].",
  "status" : "PERMISSION_DENIED"
}

Am I missing some Roles or permissions?

Magda Kiwi asked Dec 12 '18 09:12



2 Answers

In case someone else finds this helpful, these are the roles that were needed:

  • For the controller service account: Dataflow Worker and Storage Object Admin (both listed in Google's documentation).

  • For the executor (the account whose credentials submit the job): Service Account User.
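As a rough sketch, the role assignments above could be applied with gcloud along these lines. The project ID and both account emails are placeholders invented for illustration, and the script only prints the commands unless APPLY=1 is set, so treat it as a starting point rather than a definitive setup.

```shell
#!/bin/sh
# Sketch only: PROJECT, CONTROLLER_SA and EXECUTOR_SA are placeholder values.
set -eu

PROJECT="${PROJECT:-my-project}"
CONTROLLER_SA="${CONTROLLER_SA:-controller@my-project.iam.gserviceaccount.com}"
EXECUTOR_SA="${EXECUTOR_SA:-executor@my-project.iam.gserviceaccount.com}"

# Print the command by default; execute it only when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Bind one project-level role to one service account.
# $1 = service account email, $2 = role ID
grant_role() {
  run gcloud projects add-iam-policy-binding "$PROJECT" \
    --member="serviceAccount:$1" \
    --role="$2"
}

# Controller (worker) account: Dataflow Worker and Storage Object Admin.
grant_role "$CONTROLLER_SA" roles/dataflow.worker
grant_role "$CONTROLLER_SA" roles/storage.objectAdmin

# Executor account (the one submitting the job): Service Account User.
grant_role "$EXECUTOR_SA" roles/iam.serviceAccountUser
```

Running it without APPLY set shows the three gcloud commands so they can be reviewed before anything changes in the project.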

Magda Kiwi answered Oct 16 '22 19:10


I just hit this problem again, so I'm posting my solution here, as I fully expect to get bitten by it again at some point.

I was getting this error:

Error: googleapi: Error 403: (a00eba23d59c1fa3): Current user cannot act as service account [email protected]. Causes: (a00eba23d59c15ac): Current user cannot act as service account [email protected]., forbidden

I was deploying the Dataflow job via Terraform, using a different service account, [email protected]

The solution was to grant that service account the roles/iam.serviceAccountUser role:

gcloud projects add-iam-policy-binding myproject \
    --member=serviceAccount:[email protected] \
    --role=roles/iam.serviceAccountUser
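The binding above is project-wide, so the deploying account can act as any service account in the project. If that is broader than you want, one possible alternative is to bind the same role on the target service account itself. The sketch below assumes placeholder emails for both accounts (they are not the ones from the error message) and prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: TARGET_SA and DEPLOYER_SA are placeholder emails.
set -eu

# The account the Dataflow job runs as (the one named in the 403 error).
TARGET_SA="${TARGET_SA:-controller@my-project.iam.gserviceaccount.com}"
# The account that submits the job (for example, the one Terraform uses).
DEPLOYER_SA="${DEPLOYER_SA:-deployer@my-project.iam.gserviceaccount.com}"

# Print the command by default; execute it only when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Grant Service Account User on the single target account, not the whole project.
run gcloud iam service-accounts add-iam-policy-binding "$TARGET_SA" \
  --member="serviceAccount:$DEPLOYER_SA" \
  --role="roles/iam.serviceAccountUser"
```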

For those who prefer custom IAM roles over predefined ones, the specific permission that was missing was iam.serviceAccounts.actAs.
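For the custom-role route, a minimal sketch might look like the following. The role ID, title, project ID and account email are all hypothetical names chosen for illustration; the only piece taken from the answer is the iam.serviceAccounts.actAs permission. The script prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: PROJECT, ROLE_ID and DEPLOYER_SA are hypothetical values.
set -eu

PROJECT="${PROJECT:-my-project}"
ROLE_ID="${ROLE_ID:-dataflowDeployerActAs}"
DEPLOYER_SA="${DEPLOYER_SA:-deployer@my-project.iam.gserviceaccount.com}"

# Print the command by default; execute it only when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# 1. Create a custom role that carries only the missing permission.
run gcloud iam roles create "$ROLE_ID" \
  --project="$PROJECT" \
  --title="DataflowDeployerActAs" \
  --permissions="iam.serviceAccounts.actAs"

# 2. Bind the custom role to the deploying service account.
run gcloud projects add-iam-policy-binding "$PROJECT" \
  --member="serviceAccount:$DEPLOYER_SA" \
  --role="projects/$PROJECT/roles/$ROLE_ID"
```

Custom roles are referenced by their full path (projects/PROJECT/roles/ROLE_ID) when binding, unlike predefined roles such as roles/iam.serviceAccountUser.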

jamiet answered Oct 16 '22 19:10