I am trying to run a Spark job on a Google Dataproc cluster as follows:
gcloud dataproc jobs submit hadoop --cluster <cluster-name> \
--jar file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
--class org.apache.hadoop.examples.WordCount \
-- arg1 arg2
But the job throws this error:
(gcloud.dataproc.jobs.submit.spark) PERMISSION_DENIED: Request had insufficient authentication scopes.
How do I add the auth scopes required to run the job?
At a minimum, service accounts used with Cloud Dataproc need permissions to read and write to Google Cloud Storage, and to write to Google Cloud Logging.
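If the cluster's service account lacks those permissions, one way to grant them is an IAM policy binding. The sketch below assumes a hypothetical project ID and service account email, and uses the Dataproc Worker role, which bundles the Storage and Logging permissions Dataproc needs:

# Hypothetical project and service account; roles/dataproc.worker covers the
# Storage and Logging permissions Dataproc requires.
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:my-dataproc-sa@my-project.iam.gserviceaccount.com" \
--role="roles/dataproc.worker"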
Click the cluster name on the Dataproc Clusters page in the Google Cloud console, click STOP and wait for the cluster to stop, then click START to start it again.
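The same stop/start cycle can also be done from the CLI; this is a sketch with a placeholder cluster name and region:

# Placeholder cluster name and region.
gcloud dataproc clusters stop my-cluster --region=us-central1
gcloud dataproc clusters start my-cluster --region=us-central1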
Usually this error occurs because you're running gcloud from inside a GCE VM whose credentials are limited by scopes controlled through VM metadata; gcloud installed on a local machine typically already uses broad credentials that cover all GCP operations.
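To confirm that limited scopes are the problem, you can check which scopes the VM's default service account actually has; a quick diagnostic, run from inside the VM, is to query the metadata server:

# Run from inside the GCE VM; lists the OAuth scopes granted to the VM's
# default service account.
curl -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"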
For Dataproc access, when creating the VM from which you're running gcloud, you need to specify --scopes cloud-platform from the CLI; if you're creating the VM from the Cloud Console UI, select "Allow full access to all Cloud APIs" instead.
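For example, a minimal sketch of creating such a VM from the CLI (instance name and zone are placeholders):

# Placeholder instance name and zone; cloud-platform allows access to all GCP
# APIs, still gated by the service account's IAM roles.
gcloud compute instances create my-gcloud-vm \
--zone=us-central1-a \
--scopes=cloud-platform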
As noted in the comments, nowadays you can also update the scopes of an existing GCE instance to add the cloud-platform scope.
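Changing scopes requires the instance to be stopped first; the following is a sketch with a placeholder instance name and zone:

# Placeholder instance name and zone; the instance must be stopped before
# its scopes can be changed.
gcloud compute instances stop my-gcloud-vm --zone=us-central1-a
gcloud compute instances set-service-account my-gcloud-vm \
--zone=us-central1-a \
--scopes=cloud-platform
gcloud compute instances start my-gcloud-vm --zone=us-central1-a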