 

"Request had insufficient authentication scopes" when running a Spark job on Dataproc

I am trying to run a Spark job on a Google Dataproc cluster as follows:

 gcloud dataproc jobs submit hadoop --cluster <cluster-name> \
     --jar file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
     --class org.apache.hadoop.examples.WordCount \
     --arg1 \
     --arg2

But the job throws the error:

 (gcloud.dataproc.jobs.submit.spark) PERMISSION_DENIED: Request had insufficient authentication scopes.

How do I add the auth scopes needed to run the job?

asked Apr 12 '17 by Vishal
People also ask

What are the minimum permissions needed for a service account used with Google Dataproc?

At a minimum, service accounts used with Cloud Dataproc need permissions to read and write to Google Cloud Storage, and to write to Google Cloud Logging.
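For example, a minimal sketch of granting those permissions by binding the predefined roles/dataproc.worker role (which bundles the Cloud Storage and Cloud Logging permissions Dataproc VMs need); the project ID and service account email below are placeholders:

    # Placeholders: my-project, my-dataproc-sa@my-project.iam.gserviceaccount.com
    gcloud projects add-iam-policy-binding my-project \
        --member="serviceAccount:my-dataproc-sa@my-project.iam.gserviceaccount.com" \
        --role="roles/dataproc.worker"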

How do I restart a Dataproc cluster?

Click the cluster name from the Dataproc Clusters page in the Google Cloud console, then click STOP to stop and START to start the cluster.
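The same can be done from the CLI; a sketch, with the cluster name and region as placeholders:

    # Placeholders: my-cluster, us-central1
    gcloud dataproc clusters stop my-cluster --region=us-central1
    gcloud dataproc clusters start my-cluster --region=us-central1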


1 Answer

Usually you run into this error when running gcloud from inside a GCE VM whose access is restricted by VM-metadata-controlled scopes; gcloud installed on a local machine will typically already be using broad scopes that cover all GCP operations.
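One way to confirm this from inside the VM is to query the metadata server for the OAuth scopes attached to the default service account, for example:

    # Run from inside the GCE VM; lists the scopes granted via VM metadata
    curl -H "Metadata-Flavor: Google" \
        "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"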

For Dataproc access, when creating the VM from which you run gcloud, specify --scopes cloud-platform on the CLI, or, if creating the VM from the Cloud Console UI, select "Allow full access to all Cloud APIs":

[Screenshot: Cloud Console Create VM UI, "Identity and API access" section]
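A sketch of the CLI variant, with the instance name and zone as placeholders:

    # Placeholders: gcloud-client-vm, us-central1-a
    gcloud compute instances create gcloud-client-vm \
        --zone=us-central1-a \
        --scopes=cloud-platform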

As mentioned in the comments, nowadays you can also update the scopes of an existing GCE instance to add the CLOUD_PLATFORM scope.
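A rough sketch of that, assuming the instance is stopped first and using the same placeholder names as above:

    gcloud compute instances stop gcloud-client-vm --zone=us-central1-a
    # --scopes=cloud-platform corresponds to the CLOUD_PLATFORM scope;
    # if the VM uses a non-default service account, also pass --service-account
    gcloud compute instances set-service-account gcloud-client-vm \
        --zone=us-central1-a \
        --scopes=cloud-platform
    gcloud compute instances start gcloud-client-vm --zone=us-central1-a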

answered Oct 02 '22 by Dennis Huo