
What are the pros and cons of Java serialization vs. Kryo serialization?

In Spark, Java serialization is the default. If Kryo is so much more efficient, why is it not the default? Are there downsides to using Kryo, and in which scenarios should we use Kryo rather than Java serialization?

supernatural asked Nov 20 '19 04:11



2 Answers

Here is the relevant comment from the Spark documentation:

Kryo is significantly faster and more compact than Java serialization (often as much as 10x), but does not support all Serializable types and requires you to register the classes you’ll use in the program in advance for best performance.

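So it is not used by default because:

  1. Not every java.io.Serializable type is supported out of the box: a custom class that implements Serializable is not guaranteed to serialize correctly with Kryo.
  2. Custom classes need to be registered in advance for best performance. Unregistered classes still work by default, but Kryo then has to store the full class name with each object, which is wasteful.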

Note that, according to the documentation:

Spark automatically includes Kryo serializers for the many commonly-used core Scala classes covered in the AllScalaRegistrar from the Twitter chill library.
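As a rough illustration of the points above, here is a minimal sketch of the relevant settings as they might appear in spark-defaults.conf. The class name com.example.MyRecord is a hypothetical placeholder for your own type. The property names are standard Spark configuration keys, but check the defaults for your Spark version.

```
# Switch from the default Java serializer to Kryo
spark.serializer                 org.apache.spark.serializer.KryoSerializer
# Comma-separated list of custom classes to register (hypothetical class name)
spark.kryo.classesToRegister     com.example.MyRecord
# Optional: throw an error on unregistered classes instead of silently
# falling back to writing the full class name with each object
spark.kryo.registrationRequired  true
```

Leaving spark.kryo.registrationRequired at its default of false keeps unregistered classes working, at the cost of larger serialized output.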

Vladislav Varslavans answered Oct 19 '22 11:10


Kryo pros: low memory consumption.
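To get a feel for where the size difference comes from, here is a small sketch using only the Java standard library. It compares the size of a Java-serialized object with the size of its raw field data. The Point class is hypothetical, and writing raw fields is not what Kryo actually does. It only serves as a lower bound that shows how much class metadata Java serialization adds for a tiny object.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationOverhead {
    // Hypothetical value type, used only for this illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Number of bytes produced by standard Java serialization
    // (stream header, class descriptor, and field values).
    static int javaSerializedSize(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Number of bytes for the two int fields alone, with no class metadata.
    static int rawFieldSize(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println("Java serialization: " + javaSerializedSize(p) + " bytes");
        System.out.println("Raw fields only:    " + rawFieldSize(p) + " bytes");
    }
}
```

The gap between the two numbers is mostly the class descriptor that Java serialization writes into the stream. Registering classes with Kryo lets it replace that descriptor with a small numeric ID.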

The one time Kryo did not work for me as-is was when I was dealing with Google protobufs. That is when I first had to register the proto class.

https://mvnrepository.com/artifact/de.javakaffee/kryo-serializers/0.45

Pranav Sawant answered Oct 19 '22 10:10