Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When to use transient, when not to in flink?

Tags:

apache-flink

in this code, should i use transient?

when can i use transient?

what is the difference ?

need your help

private              Map<String, HermesCustomConsumer> topicSourceMap                 = new ConcurrentHashMap();
private              Map<TopicAndPartition, Long>      currentOffsets                 = new HashMap<>();
private transient Map<TopicAndPartition, Long>         restoredState;
like image 624
liu xiuyuan Avatar asked Mar 18 '19 13:03

liu xiuyuan


People also ask

Why do we use transient?

transient is a variables modifier used in serialization. At the time of serialization, if we don't want to save value of a particular variable in a file, then we use transient keyword. When JVM comes across transient keyword, it ignores original value of the variable and save default value of that variable data type.

Can we use transient keyword with method?

It converts the byte sequence into original object data. During the serialization, when we do not want an object to be serialized we can use a transient keyword.

What is Flink operator State?

In operator state, the state is bound to an operator on one parallel substream. Keyed streams are created by defining keys for the elements of a stream. The keyed stream is read by the stateful operator and per key state is stored locally and can be accessed by the operator throughout the data streaming process.


1 Answers

TL;DR
If you use transient variable, you'd better instantiate it in open() method of operators which implemented Rich interface. Otherwise, declare the variable with an initial value at the same time.

The states you use here are called raw states managed by the user itself. Whether you should use transient modifier depending on serialization purpose. Before you submit the Flink job. The computation topology will be generated and distributed into Flink cluster. And operators including source and sink will instantiate with fields e.g, topicSourceMap in your code. Variables topicSourceMap and currentOffsets have been instantiated with constructor. While restoredState is a transient variable, thus no matter what initial value you assigned with, it will not be serialized and distributed into some task to execute. So you usually need to instanciate it in open() method of operator which implemented Rich interface. After this operator is deserialized in some task, open() method would be invoked into instantiate your own states.

like image 114
LeoZhang Avatar answered Sep 17 '22 21:09

LeoZhang