Multiple SparkContext detected in the same JVM

Following up on my last question, I need to define multiple SparkContexts in a single JVM.

I did it in the following way (using Java):

SparkConf conf = new SparkConf();
conf.setAppName("Spark MultipleContest Test");
conf.set("spark.driver.allowMultipleContexts", "true");
conf.setMaster("local");

After that I created the following source code:

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

and later in the code:

JavaSparkContext ctx = new JavaSparkContext(conf);
JavaRDD<Row> testRDD = ctx.parallelize(AllList);

After executing the code I got the following error message:

16/01/19 15:21:08 WARN SparkContext: Multiple running SparkContexts detected in the same JVM!
org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:81)
test.MLlib.BinarryClassification.main(BinaryClassification.java:41)
    at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$1.apply(SparkContext.scala:2083)
    at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$1.apply(SparkContext.scala:2065)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext$.assertNoOtherContextIsRunning(SparkContext.scala:2065)
    at org.apache.spark.SparkContext$.setActiveContext(SparkContext.scala:2151)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:2023)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at test.MLlib.BinarryClassification.main(BinaryClassification.java:105)

The numbers 41 and 105 are the lines where the two objects are defined in the Java code. My question is: is it possible to run multiple SparkContexts in the same JVM, and how do I do it, given that I already use the set method?

asked Jan 19 '16 by Guforu

People also ask

Can we have multiple SparkContexts in a single JVM?

Since the question talks about SparkSessions, it's important to point out that there can be multiple SparkSessions running, but only a single SparkContext per JVM.
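
A minimal Java sketch of that behaviour (assuming Spark 2.x; the app name is only illustrative):

import org.apache.spark.sql.SparkSession;

// Two SparkSessions can coexist, but they wrap the same SparkContext
SparkSession first = SparkSession.builder()
        .master("local")
        .appName("session-demo")          // illustrative name
        .getOrCreate();
SparkSession second = first.newSession(); // separate SQL conf and temp views

// Both sessions share one underlying context
System.out.println(first.sparkContext() == second.sparkContext()); // true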

How many SparkContexts can be active at a time?

Note, however, that SparkSession encapsulates SparkContext, and since by default we can have only one context per JVM, all the SparkSessions from the example above are represented in the UI within a single application, usually the first one launched.

How many SparkContexts can be created?

Main entry point for Spark functionality. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster. Only one SparkContext may be active per JVM. You must stop() the active SparkContext before creating a new one.
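
In practice that means stopping the running context before constructing another one. A rough Java sketch (names are only illustrative):

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

SparkConf conf = new SparkConf()
        .setAppName("context-demo")   // illustrative name
        .setMaster("local");

SparkContext first = new SparkContext(conf);
// ... use the first context ...
first.stop();                          // release the active context

SparkContext second = new SparkContext(conf); // legal again: only one context runs at a time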

Should I use SparkSession or SparkContext?

Once the SparkSession is instantiated, we can configure Spark's run-time config properties. From Spark 2.0.0 onwards, it is better to use SparkSession, as it provides access to all the functionality that SparkContext does. It also provides APIs to work with DataFrames and Datasets.
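
For illustration, a hedged Java sketch of what that looks like (Spark 2.x; the config key and data are only examples):

import java.util.Arrays;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .master("local")
        .appName("sparksession-demo")  // illustrative name
        .getOrCreate();

// Run-time config can be changed on the session itself
spark.conf().set("spark.sql.shuffle.partitions", "4");

// Dataset/DataFrame APIs are available directly, no separate SQLContext needed
Dataset<String> names = spark.createDataset(
        Arrays.asList("alice", "bob"), Encoders.STRING());
names.show();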


2 Answers

Rather than using SparkContext, you should use the builder method on SparkSession, which instantiates the Spark and SQL contexts more robustly and ensures that there is no context conflict.

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("demo").getOrCreate()
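
Since the question is in Java, the equivalent builder call would look roughly like this (a sketch assuming Spark 2.x is on the classpath):

import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .master("local")
        .appName("demo")
        .getOrCreate();

// The session already carries the SQL functionality; if an explicit
// SQLContext is still needed, it can be obtained from the session
SQLContext sqlContext = spark.sqlContext();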
answered Oct 22 '22 by Gaurang Shah


Are you sure you need the JavaSparkContext as a separate context? The previous question that you refer to doesn't say so. If you already have a SparkContext, you can create a new JavaSparkContext from it, rather than creating a separate context:

SparkConf conf = new SparkConf();
conf.setAppName("Spark MultipleContest Test");
conf.set("spark.driver.allowMultipleContexts", "true");
conf.setMaster("local");

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

// Create a JavaSparkContext which wraps the same Scala context under the hood
JavaSparkContext ctx = JavaSparkContext.fromSparkContext(sc);
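
From there, the rest of the question's code can run against that context; a small sketch with a hypothetical stand-in for the original AllList (the real list isn't shown):

import java.util.Arrays;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;

// Hypothetical stand-in for the question's AllList
List<String> allList = Arrays.asList("a", "b", "c");
JavaRDD<String> testRDD = ctx.parallelize(allList);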
answered Oct 21 '22 by mattinbits