In Spark, Java serialization is the default. If Kryo is that efficient, why is it not the default? Are there drawbacks to using Kryo, and in which scenarios should we use Kryo rather than Java serialization?
Serialization lets us transfer objects over a network by converting them into a byte stream. It also preserves the state of the object, so that state can be stored and restored later. Restoring an object from its serialized form can take less time than rebuilding that state from scratch, which is where serialization can save time.
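To make the round trip concrete, here is a minimal sketch using only the JDK's built-in serialization. The class name RoundTripDemo, the Point type, and the helper methods are hypothetical names chosen for illustration, not part of Spark or Kryo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class RoundTripDemo {
    // Hypothetical value class; implementing Serializable opts it in to JDK serialization.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Convert an object into a byte array (the "byte stream" sent over the network).
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild an object, including its field values, from the bytes.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point original = new Point(3, 4);
        byte[] bytes = serialize(original);
        Point copy = (Point) deserialize(bytes);
        System.out.println(copy.x + "," + copy.y + " restored from " + bytes.length + " bytes");
    }
}
```

The restored object is a new instance with the same field values as the original, which is the "preserved state" described above.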
Kryo is a fast and efficient binary object graph serialization framework for Java. The goals of the project are high speed, low size, and an easy to use API. The project is useful any time objects need to be persisted, whether to a file, database, or over the network.
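You can switch to Kryo by setting spark.serializer to org.apache.spark.serializer.KryoSerializer, for example with conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer"). This setting configures the serializer used not only for shuffling data between worker nodes but also for serializing RDDs to disk. The only reason Kryo is not the default is the custom registration requirement, but we recommend trying it in any network-intensive application.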
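The same setting can also be applied outside application code. This is a minimal sketch, assuming the job reads its configuration from spark-defaults.conf; the buffer value is an illustrative assumption, not a recommendation.

```properties
# Use Kryo instead of Java serialization for shuffles and serialized RDD storage.
spark.serializer                 org.apache.spark.serializer.KryoSerializer

# Optional: raise the maximum Kryo buffer if large objects fail to serialize.
# 128m is an assumed example value.
spark.kryoserializer.buffer.max  128m
```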
Java serialization is slow because it uses reflection. JDK serialization also does a lot of backward-compatibility checking and strict type checking. On the other hand, Java serialization gives you back the same object after deserialization in most cases.
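Part of that cost is visible in the output size: to support its type checks, JDK serialization writes a class descriptor (class name, serialVersionUID, field names and types) into the stream. The sketch below compares that with writing the two field values by hand. The hand-written encoding is only a stand-in for a compact format and is not how Kryo encodes data; OverheadDemo and Point are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class OverheadDemo {
    // Hypothetical value class with two int fields (8 bytes of actual data).
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // JDK serialization: stream header + class descriptor + field values.
    static byte[] javaSerialized(Point p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hand-written encoding: only the two int values, no type information.
    static byte[] rawFields(Point p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println("JDK serialization: " + javaSerialized(p).length + " bytes");
        System.out.println("Raw field values:  " + rawFields(p).length + " bytes");
    }
}
```

The gap shrinks when many instances of the same class share one stream, since the class descriptor is written only once per stream, but it is a useful reminder of what the extra checking costs.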
Here is a comment from the documentation:
Kryo is significantly faster and more compact than Java serialization (often as much as 10x), but does not support all Serializable types and requires you to register the classes you’ll use in the program in advance for best performance.
So it is not used by default because not every java.io.Serializable type is supported out of the box: if you have a custom class that extends Serializable, it still cannot be serialized with Kryo unless it is registered. Note that, according to the documentation:
Spark automatically includes Kryo serializers for the many commonly-used core Scala classes covered in the AllScalaRegistrar from the Twitter chill library.
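For your own classes, registration can be done through configuration. This is a minimal sketch, assuming settings are supplied via spark-defaults.conf; com.example.MyRecord and com.example.MyKey are hypothetical class names standing in for your application's types.

```properties
spark.serializer                 org.apache.spark.serializer.KryoSerializer

# Comma-separated list of custom classes to register with Kryo.
# The class names below are hypothetical placeholders.
spark.kryo.classesToRegister     com.example.MyRecord,com.example.MyKey

# Optional: throw an exception when an unregistered class is serialized,
# instead of writing its class name along with each object.
spark.kryo.registrationRequired  true
```

Enabling spark.kryo.registrationRequired is a way to find out which classes still need registering, at the cost of failing the job until all of them are listed.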
Kryo pros: memory consumption is low, because the serialized data is more compact.
The one time Kryo didn't work for me as-is was when I was dealing with Google protobufs. That's when I first had to register the proto class.
https://mvnrepository.com/artifact/de.javakaffee/kryo-serializers/0.45