
kryo.readObject causes NullPointerException with ArrayList

Tags: java, kryo

I get a NullPointerException when I deserialize an ArrayList object using Kryo.

Caused by: java.lang.NullPointerException
at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:215)
at java.util.ArrayList.ensureCapacity(ArrayList.java:199)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:96)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)

I can see that StdInstantiatorStrategy creates an ArrayList without calling its constructor, which leaves one of its fields uninitialized and causes the exception.

The documentation says that the no-argument constructor should be called first and that, if none is available, StdInstantiatorStrategy should be used to do field-by-field initialization.
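The two instantiation paths described above can be reproduced with the JDK alone. The sketch below is not Kryo's code: the class and method names (InstantiationSketch, allocateWithoutConstructor, constructorFirst, throwsNpeOnEnsureCapacity) are made up for illustration, and it assumes a HotSpot/OpenJDK runtime where sun.misc.Unsafe can be reached through reflection. It allocates an ArrayList without running a constructor, which leaves its backing array null, and compares that with a "constructor first, constructor-less allocation as fallback" approach.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.ArrayList;

class InstantiationSketch {

    // Allocates an instance without running any constructor, similar in spirit
    // to a constructor-bypassing instantiator. Assumption: sun.misc.Unsafe is
    // present and reachable via reflection (HotSpot/OpenJDK).
    static <T> T allocateWithoutConstructor(Class<T> type) {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Object unsafe = field.get(null);
            Object instance = unsafeClass
                    .getMethod("allocateInstance", Class.class)
                    .invoke(unsafe, type);
            return type.cast(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe allocation failed", e);
        }
    }

    // Tries the no-argument constructor first and only falls back to
    // constructor-less allocation when the class does not declare one.
    static <T> T constructorFirst(Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            return ctor.newInstance();
        } catch (NoSuchMethodException e) {
            return allocateWithoutConstructor(type);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Constructor call failed", e);
        }
    }

    // Mirrors the call in the stack trace: ensureCapacity dereferences the
    // list's internal array, which is null if no constructor ever ran.
    static boolean throwsNpeOnEnsureCapacity(ArrayList<?> list) {
        try {
            list.ensureCapacity(16);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ArrayList<?> raw = allocateWithoutConstructor(ArrayList.class);
        System.out.println("no constructor:    NPE = " + throwsNpeOnEnsureCapacity(raw));

        ArrayList<?> built = constructorFirst(ArrayList.class);
        System.out.println("constructor first: NPE = " + throwsNpeOnEnsureCapacity(built));
    }
}
```

The first call is expected to report an NPE and the second is not, which matches the symptom in the trace: the failure suggests the constructor-less strategy was applied to ArrayList directly instead of only as a fallback.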

What am I doing wrong?

asked May 30 '14 at 20:05 by cquezel



1 Answer

I ran into the same problem and eventually solved it. In my case, I was using protobuf objects in a Spark job, and Spark's Kryo serializer could not serialize or deserialize them properly. There are two ways to solve this.

  1. Use protobuf's own serialization instead of Kryo's. For example, convert your Spark rdd[YourProtobufObject] to rdd[ByteString], calling pb.toByteString() to serialize and .parseFrom(xxByteString) to deserialize. This approach is not elegant, but it works.
  2. Register your protobuf class with Kryo, as follows.
  • First, add the configuration to SparkConf. For example:
conf.set("spark.serializer","org.apache.spark.serializer.KryoSerializer")
.set("spark.kryo.registrator","your.own.registrator.implement.MyKryoRegistrator")
  • Second, create your own registrator implementation. You can use the ProtobufSerializer from Twitter's open-source Chill project directly; add it as a dependency (it is available on mvnrepository). The Maven dependencies look like this:

     <dependency>
         <groupId>com.twitter</groupId>
         <artifactId>chill_2.11</artifactId>
         <version>0.9.3</version>
     </dependency>
     <dependency>
         <groupId>com.twitter</groupId>
         <artifactId>chill-protobuf</artifactId>
         <version>0.9.3</version>
         <exclusions>
             <exclusion>
                 <groupId>com.twitter</groupId>
                 <artifactId>chill-java</artifactId>
             </exclusion>
         </exclusions>
     </dependency>
    

Then create your own Kryo registrator implementation, named MyKryoRegistrator:

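import com.esotericsoftware.kryo.Kryo
import com.twitter.chill.protobuf.ProtobufSerializer
import org.apache.spark.serializer.KryoRegistrator

class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Let Chill's ProtobufSerializer handle this protobuf class instead of Kryo's default serializer
    kryo.register(classOf[YourProtobufObject], new ProtobufSerializer())
  }
}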
  • Third, rerun your Spark job; it should now work.
answered Sep 19 '22 at 19:09 by Armstrongya