
Spark - How to run a standalone cluster locally

Is it possible to run a Spark standalone cluster locally on just one machine (which is different from merely developing jobs in local mode, i.e., local[*])?

So far I have been running two different VMs to build a cluster. What if I could run a standalone cluster on the very same machine, with, for instance, three different JVMs running?

Could something like having multiple loopback addresses do the trick?

asked May 06 '15 by luke

People also ask

How do I make a Spark standalone cluster?

To install Spark in standalone mode, you simply place a compiled version of Spark on each node of the cluster. You can obtain pre-built versions of Spark with each release or build it yourself.
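For a genuinely multi-node setup, the standalone launch scripts can start everything over SSH from a list of worker hosts kept in the conf/ directory (conf/workers in recent releases, conf/slaves in older ones; check your version). A minimal sketch, with placeholder hostnames:

# list the worker hostnames, one per line (file name depends on the Spark version)
echo "worker1" >> conf/workers
echo "worker2" >> conf/workers

# start the master plus every listed worker
./sbin/start-all.sh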

Can Spark be run locally?

It's easy to run Spark locally on one machine: all you need is to have Java installed on your system PATH, or the JAVA_HOME environment variable pointing to a Java installation. Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+.
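That plain local mode (the local[*] the question contrasts with) needs no cluster processes at all. As a rough sketch, reusing the examples jar referenced in the answer below (its path differs between Spark releases):

# run SparkPi inside a single JVM, using all local cores
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master "local[*]" lib/spark-examples-1.2.1-hadoop2.4.0.jar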


1 Answer

Yes, you can do it: launch one master and one worker node and you are good to go.

Launch the master:

./sbin/start-master.sh
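If the master binds to your machine's public hostname rather than localhost, it can be pinned explicitly; the options below are the ones the start script usually forwards to the Master class (treat this as a sketch and check your version's --help output):

./sbin/start-master.sh --host localhost --port 7077 --webui-port 8080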

Launch a worker:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M
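To mimic the three separate JVMs the question mentions, just start extra workers against the same master, giving each one its own web UI port (the Worker class accepts --webui-port; 8081 is the default). Each command runs in the foreground, so use separate terminals or background them:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M --webui-port 8082
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M --webui-port 8083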

Run the SparkPi example:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://localhost:7077 lib/spark-examples-1.2.1-hadoop2.4.0.jar
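The jar path above matches the Spark 1.2.1 layout; in newer releases the examples jar sits under examples/jars/ with a Scala-version suffix, so adjust it for your download. Either way, the master web UI (port 8080 by default) shows the registered workers and completed applications:

# open in a browser, or just probe it
curl http://localhost:8080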

Apache Spark Standalone Mode Documentation

answered Sep 28 '22 by banjara