
PySpark, initializing Spark programmatically: IllegalArgumentException: Missing application resource

Tags: python, pyspark

When creating a SparkContext in Python, I get the following error:

 from pyspark import SparkContext

 app_name = "my_app"
 master = "local[*]"
 sc = SparkContext(appName=app_name, master=master)

Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:160)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:276)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:151)
at org.apache.spark.launcher.Main.main(Main.java:86)

....

pyspark.zip/pyspark/java_gateway.py", line 94, in launch_gateway
raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

The Spark launcher seems to be failing before the Java gateway can start.

Sasinda Rukshan asked Oct 04 '16 20:10



1 Answer

This was happening because of preexisting environment variables that conflicted with the settings passed to SparkContext. I deleted them inside the Python program, before creating the context, and it now works smoothly.

For example:

import os

# Remove the conflicting variable before creating the SparkContext.
# pop() with a default avoids a KeyError when the variable is not set.
os.environ.pop('PYSPARK_SUBMIT_ARGS', None)
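The same cleanup can be wrapped in a small helper that reports what it removed, which makes it easier to see which variable was actually causing the conflict. This is only a sketch: `clear_spark_env` and `SUSPECT_VARS` are hypothetical names, and `PYSPARK_SUBMIT_ARGS` is the only variable confirmed by this answer.

```python
import os

# Assumption: only PYSPARK_SUBMIT_ARGS is known to matter here;
# add other names if your environment sets them.
SUSPECT_VARS = ("PYSPARK_SUBMIT_ARGS",)


def clear_spark_env(names=SUSPECT_VARS, environ=os.environ):
    """Remove the given variables from environ and return what was removed."""
    removed = {}
    for name in names:
        value = environ.pop(name, None)  # no KeyError if the variable is unset
        if value is not None:
            removed[name] = value
    return removed


if __name__ == "__main__":
    for name, value in clear_spark_env().items():
        print(f"removed {name}={value!r}")
    # Create the context only after the environment is clean, e.g.:
    # from pyspark import SparkContext
    # sc = SparkContext(appName="my_app", master="local[*]")
```

Printing the removed value before discarding it also shows what to look for in the shell startup files.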

The proper fix is to remove the variable from whichever shell initialization script sets it (.bashrc, .zshrc, or similar). However, I could not find it in .bash_profile (on a Mac), .bashrc, or /etc/environment.conf. I will update this answer if I find where it is set.
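One way to hunt for the definition is to scan the usual startup files for the variable name. The sketch below does that; the file list is a guess at common locations rather than an exhaustive one, and `find_definitions` is a hypothetical helper name. It will not catch variables set by an IDE, a launcher, or a parent process.

```python
from pathlib import Path

# Assumption: common shell startup files; adjust for your system.
CANDIDATE_FILES = (
    "~/.bashrc",
    "~/.bash_profile",
    "~/.profile",
    "~/.zshrc",
    "~/.zshenv",
    "/etc/profile",
    "/etc/environment",
)


def find_definitions(var_name, paths=CANDIDATE_FILES):
    """Return (path, line_number, line) for every line that mentions var_name."""
    hits = []
    for raw in paths:
        path = Path(raw).expanduser()
        if not path.is_file():
            continue  # skip files that do not exist on this machine
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file, e.g. permission denied
        for number, line in enumerate(text.splitlines(), start=1):
            if var_name in line:
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_definitions("PYSPARK_SUBMIT_ARGS"):
        print(f"{path}:{number}: {line}")
```

If this prints nothing, the variable is probably being set somewhere other than these files.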

Sasinda Rukshan answered Oct 21 '22 19:10
