How to set and get static variables from spark?

I have a class like this:

public class Test {
    private static String name;

    public static String getName() {
        return name;
    }

    public static void setName(String name) {
        Test.name = name;
    }

    public static void print() {
        System.out.println(name);
    }

}

Inside my Spark driver, I set the name like this and then call the print() method:

public final class TestDriver{

    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("TestApp");
        // ...
        // ...
        Test.setName("TestName");
        Test.print();
        // ...
    }
}

However, I'm getting a NullPointerException. How do I pass a value to the global variable and use it?

diplomaticguru asked Apr 16 '15 20:04



2 Answers

OK, there are basically two ways to get a value known to the driver (master) over to the executors:

  1. Put the value inside a closure, which is serialized and sent to the executors to perform a task. This is the most common approach and it is very simple and elegant. Sample and doc here.
  2. Create a broadcast variable with the data. This is good for large immutable data, where you want to guarantee it is sent only once. It is also good if the same data is used over and over. Sample and doc here.
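
The first option can be illustrated without Spark at all. The following is only a plain-Java sketch of the idea, assuming (as is the default in Spark) that closures travel via Java serialization. ClosureShippingSketch, SerializableFunction, tagWith and roundTrip are hypothetical names made up for this example. In a real job the lambda would be passed to a transformation such as map(), and Spark would do the round trip for you.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

// Hypothetical demo class; not part of Spark or of the question's code.
class ClosureShippingSketch {

    // A function type that can be serialized (assumption: stands in for Spark's function interfaces).
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Builds a closure that captures the driver-side value as part of its own state.
    static SerializableFunction<String, String> tagWith(String name) {
        return record -> name + ":" + record;
    }

    // Serializes and deserializes an object, standing in for the driver-to-executor hop.
    static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // The captured "TestName" is stored inside the serialized closure, not in a static field.
        SerializableFunction<String, String> shipped = roundTrip(tagWith("TestName"));
        System.out.println(shipped.apply("row1"));
    }
}
```

The captured value survives the round trip because it is part of the closure's serialized state. A broadcast variable follows the same principle, except that Spark ships the data once per executor instead of once per task.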

No need to use static variables in either case. But, if you DO want to have static values available on your executor VMs, you need to do one of these:

  1. If the values are fixed, or the configuration is available on the executor nodes (it lives inside the jar, etc.), then you can use a lazy val, which guarantees initialization only once.
  2. You can call mapPartitions() with code that uses one of the two options above, then store the values in your static variable/object. mapPartitions is guaranteed to run only once for each partition (much better than once per record) and is good for this kind of thing (initializing DB connections, etc.).

Hope this helps!

P.S.: As for your exception, I just don't see it in that code sample; my bet is that it is occurring elsewhere.


Edit for extra clarification: The lazy val solution is simply Scala, no Spark involved...

object MyStaticObject {
  lazy val MyStaticValue = {
    // Call a database, read a file included in the jar, do expensive initialization computation, etc.
    4
  }
}

Since each executor corresponds to a JVM, MyStaticObject will be initialized once its classes are loaded there. The lazy keyword guarantees that MyStaticValue is initialized only the first time it is actually requested, and it holds that value from then on.
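
Since the question is written in Java, here is a rough Java counterpart of the lazy val above, using the initialization-on-demand holder idiom. This is only a sketch: MyStaticHolder, Lazy, load and INIT_COUNT are hypothetical names, and the counter exists only to show that the initializer runs once per JVM.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java counterpart of MyStaticObject; the names are made up for this sketch.
class MyStaticHolder {

    // Counts how often the expensive initializer ran (for demonstration only).
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    private MyStaticHolder() {}

    // The JVM initializes this nested class on first access to Lazy.VALUE, and only once.
    private static final class Lazy {
        static final int VALUE = load();
    }

    private static int load() {
        INIT_COUNT.incrementAndGet();
        // Call a database, read a file included in the jar, do expensive initialization, etc.
        return 4;
    }

    // First call triggers load(); later calls in the same JVM reuse the stored value.
    static int myStaticValue() {
        return Lazy.VALUE;
    }

    public static void main(String[] args) {
        System.out.println(myStaticValue());
        System.out.println(myStaticValue());
        System.out.println("initializer runs: " + INIT_COUNT.get());
    }
}
```

Class initialization is thread-safe in the JVM, so several tasks running in the same executor can call myStaticValue() concurrently without extra locking.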

Daniel Langdon answered Sep 24 '22 23:09


I would like to add one more point to DanielL's answer.

When you declare a variable with the static keyword, the JVM initializes it during class loading. So if you build a jar in which the static fields of a Java/Scala class are given initial values, the workers can use those values directly. However, if you change the value of a static field in the driver program, the workers still see only the initial value from the jar, and your change is not reflected there. To change what the workers see, you would need to ship a new jar or copy the class manually to all executors.
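
A small plain-Java sketch of why the driver-side change does not travel: Java serialization writes instance fields but skips static ones. StaticVsInstanceSketch and its members are hypothetical names. Resetting the static field by hand is an assumption of this sketch, used only to imitate a freshly loaded class on an executor within a single JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names are made up for this sketch.
class StaticVsInstanceSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Initial value as compiled into the jar; this is all a fresh JVM knows about.
    static String staticName = "initial";

    // Instance state is written into the serialized bytes.
    final String instanceName;

    StaticVsInstanceSketch(String instanceName) {
        this.instanceName = instanceName;
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static StaticVsInstanceSketch deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (StaticVsInstanceSketch) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "Driver": change the static field and build an object to ship.
        staticName = "changed-in-driver";
        byte[] shipped = serialize(new StaticVsInstanceSketch("TestName"));

        // "Executor": imitate a freshly loaded class by restoring the initial value by hand.
        staticName = "initial";
        StaticVsInstanceSketch received = deserialize(shipped);

        System.out.println(received.instanceName); // instance field travelled with the object
        System.out.println(staticName);            // static field was not restored from the bytes
    }
}
```

So a value that has to reach the workers should be carried as instance state, a captured local variable, or a broadcast variable, rather than being set on a static field in the driver.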

vaquar khan answered Sep 20 '22 23:09