
How to set SPARK_LOCAL_DIRS parameter using spark-env.sh file

I am trying to change the location Spark writes temporary files to. Everything I've found online says to do this by setting the SPARK_LOCAL_DIRS parameter in the spark-env.sh file, but the change is not taking effect.

Here is what I've done:

  1. Created a 2-worker test cluster using Amazon EC2 instances. I'm using Spark 2.2.0 with the R sparklyr package as a front end. The worker nodes are spun up by an auto scaling group.
  2. Created a directory for the temporary files at /tmp/jaytest. There is one on each worker and one on the master.
  3. Connected with PuTTY to the Spark master machine and the two workers, opened home/ubuntu/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh on each, and modified the file to contain this line: SPARK_LOCAL_DIRS="/tmp/jaytest"

Permissions for each of the spark-env.sh files are -rwxr-xr-x, and for the jaytest folders are drwxrwxr-x.
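For reference, here is a minimal sketch of what step 3 is meant to achieve, written so it can be run anywhere without touching a real Spark install. The temporary file and directory are hypothetical stand-ins for conf/spark-env.sh and /tmp/jaytest, and sourcing the file with `set -a` is an assumption about how the launch scripts pick it up, not something confirmed on this cluster.

```shell
#!/bin/sh
# Hypothetical stand-ins: a temp file plays the role of conf/spark-env.sh,
# and a temp directory plays the role of /tmp/jaytest.
conf_file="$(mktemp)"
local_dir="$(mktemp -d)"

# The single line added to spark-env.sh in step 3.
printf 'SPARK_LOCAL_DIRS="%s"\n' "$local_dir" > "$conf_file"

# Source the file with auto-export on, so the variable reaches child
# processes (assumption: the Spark launch scripts do something similar).
set -a
. "$conf_file"
set +a

# Check 1: the directory exists and is writable by the current user.
if [ -d "$SPARK_LOCAL_DIRS" ] && [ -w "$SPARK_LOCAL_DIRS" ]; then
  echo "local dir is writable: $SPARK_LOCAL_DIRS"
fi

# Check 2: a child process can see the variable.
sh -c 'echo "child process sees: $SPARK_LOCAL_DIRS"'
```

The variable is only read when a process is launched, so a master or worker that was already running before the file was edited would not see the new value.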

As far as I can tell this is in line with all the advice I've read online. However, when I load some data into the cluster it still ends up in /tmp, rather than /tmp/jaytest.

I have also tried setting the spark.local.dir parameter to the same directory, but that had no effect either.

Can someone please advise on what I might be missing here?

Edit: I'm running this as a standalone cluster (as the answer below indicates that the correct parameter to set depends on the cluster type).

jay asked Aug 29 '18 02:08



1 Answer

According to the Spark documentation, if you are using the YARN cluster manager, its settings override the one in spark-env.sh. Can you check the yarn-env or yarn-site file for the local dir setting?

"this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS (YARN) environment variables set by the cluster manager." source - https://spark.apache.org/docs/2.3.1/configuration.html
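To make the quoted rule concrete, here is a rough sketch of the precedence it describes. The function name `resolve_local_dirs` and its arguments are made up for illustration; this only mirrors the sentence from the documentation and is not Spark's actual implementation.

```shell
#!/bin/sh
# Illustrative only: models the precedence in the quoted documentation.
# $1 = cluster type ("yarn", "standalone" or "mesos")
# $2 = value of spark.local.dir (may be empty)
resolve_local_dirs() {
  cluster_type="$1"
  conf_value="$2"
  if [ "$cluster_type" = "yarn" ] && [ -n "${LOCAL_DIRS:-}" ]; then
    # On YARN, LOCAL_DIRS set by the cluster manager wins.
    echo "$LOCAL_DIRS"
  elif [ "$cluster_type" != "yarn" ] && [ -n "${SPARK_LOCAL_DIRS:-}" ]; then
    # On Standalone or Mesos, SPARK_LOCAL_DIRS wins.
    echo "$SPARK_LOCAL_DIRS"
  elif [ -n "$conf_value" ]; then
    # Otherwise fall back to spark.local.dir.
    echo "$conf_value"
  else
    # Documented default of spark.local.dir.
    echo "/tmp"
  fi
}

# Usage: what a standalone cluster would pick with the current environment.
resolve_local_dirs standalone "/tmp/jaytest"
```

Under this reading, on a standalone cluster SPARK_LOCAL_DIRS takes priority over spark.local.dir whenever it is set in the environment of the process that starts the executors.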

Vijay L answered Sep 23 '22 07:09
