Is it possible to run a Spark standalone cluster locally on just one machine (which is basically different from just developing jobs locally, i.e., local[*])?
So far I have been running two different VMs to build a cluster. What if I could run a standalone cluster on the very same machine, having, for instance, three different JVMs running?
Could something like having multiple loopback addresses do the trick?
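For reference, the only difference I see on the submission side is the master URL passed to spark-submit. A minimal sketch of the two modes, with a placeholder application class and JAR (com.example.MyApp, my-app.jar):

# local mode: driver and executors all run inside a single JVM
./bin/spark-submit --master "local[*]" --class com.example.MyApp my-app.jar

# standalone mode: the driver connects to a separate master/worker cluster
./bin/spark-submit --master spark://localhost:7077 --class com.example.MyApp my-app.jar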
Installing Spark Standalone to a Cluster

To install Spark Standalone mode, you simply place a compiled version of Spark on each node on the cluster. You can obtain pre-built versions of Spark with each release or build it yourself.
It's easy to run locally on one machine — all you need is to have java installed on your system PATH, or the JAVA_HOME environment variable pointing to a Java installation. Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+.
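To verify that prerequisite before starting any daemons, a quick check (the JDK path below is only an example; adjust it to your installation):

java -version
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk   # example path, only needed if java is not already on the PATH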
Yes, you can do it. Launch one master and one worker node on the same machine and you are good to go.
Launch the master:
./sbin/start-master.sh
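By default the master listens on port 7077 and serves a web UI at http://localhost:8080. If either port is already in use on your machine, the documented start-master.sh options let you override them; a sketch that simply pins the defaults explicitly:

./sbin/start-master.sh --host localhost --port 7077 --webui-port 8080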
Launch a worker:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M
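Each spark-class invocation above starts its own JVM, so the three-JVM setup from the question is just a matter of launching additional workers against the same master. A sketch of a second worker (the --webui-port value is arbitrary, chosen to avoid the first worker's default UI port 8081):

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M --webui-port 8082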
Run the SparkPi example:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://localhost:7077 lib/spark-examples-1.2.1-hadoop2.4.0.jar
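To confirm the job ran on the standalone cluster rather than in local mode, you can also attach an interactive shell to the same master URL and watch the application show up in the master's web UI at http://localhost:8080:

./bin/spark-shell --master spark://localhost:7077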
Apache Spark Standalone Mode Documentation