
Spark-shell with 'yarn-client' tries to load config from wrong location

I'm trying to launch bin/spark-shell and bin/pyspark from my laptop, connecting to a YARN cluster in yarn-client mode, and I get the same error from both:

WARN ScriptBasedMapping: Exception running
/etc/hadoop/conf.cloudera.yarn1/topology.py 10.0.240.71
java.io.IOException: Cannot run program "/etc/hadoop/conf.cloudera.yarn1/topology.py" 
(in directory "/Users/eugenezhulenev/projects/cloudera/spark"): error=2, 
No such file or directory

Spark is trying to run /etc/hadoop/conf.cloudera.yarn1/topology.py on my laptop instead of on a worker node in the YARN cluster.

This problem appeared after updating from Spark 1.2.0 to 1.3.0 (CDH 5.4.2).

Eugene Zhulenev asked Oct 20 '22 07:10


1 Answer

The following steps are a temporary workaround for this issue on CDH 5.4.4:

cd ~
mkdir -p test-spark/
cd test-spark/

Then copy all files from /etc/hadoop/conf.cloudera.yarn1 on one of the worker nodes into this local directory, and run spark-shell from ~/test-spark/.
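The copy step could be scripted roughly as follows. This is only a sketch: the hostname worker-node-1 is a placeholder (not from the original answer), and it assumes you have SSH access to a worker node and that scp is available on your laptop.

```shell
# Hypothetical worker hostname; replace with a real node from your cluster.
WORKER_HOST="${WORKER_HOST:-worker-node-1}"
# Config directory named in the error message from the question.
CONF_DIR="/etc/hadoop/conf.cloudera.yarn1"
# Local directory created in the steps above.
LOCAL_DIR="${LOCAL_DIR:-$HOME/test-spark}"

copy_hadoop_conf() {
  mkdir -p "$LOCAL_DIR" || return 1
  # The glob is quoted so that it is expanded on the remote side, not locally.
  scp -r "$WORKER_HOST:$CONF_DIR/*" "$LOCAL_DIR/"
}
```

After defining the function, call copy_hadoop_conf, then change into ~/test-spark/ and start spark-shell from there.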

Zouzias answered Oct 21 '22 22:10