Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Share state among operators in Flink

I wonder if it is possible in Flink to share the state among operators.

Say, for instance, that I have partitioning by key on an operator and I need a piece of state of partition A inside partition C (for any reason) (fig 1.a), or I need the state of operator C in downstream operator F (fig 1.b).

enter image description here

I know it is possible to broadcast records to all partitions. So, if you include the internal state of an operator inside the records, you can share your internal state with downstream operators.
However, this could be an expensive operation instead of simply letting op1 specifically ask for op2 state.

Are the recent developments around queryable state moving towards this concept or they are meant only to let an external user query the internal state of the topology?

Thank you in advance for your insights

like image 506
affo Avatar asked Oct 18 '22 00:10

affo


1 Answers

In general, Flink's design does not allow to read from or write to state of other subtasks of the same or different operators. As you said, you can use broadcast to make state globally available. The queryable state features is intended for external user queries.

However, I heard of users who leveraged this features in an operator to fetch data from other operators of the same job. I don't know how well this works (stability and performance-wise). I would point you to the user mailing list for a more in-depth technical discussion if you would like to try this out.

like image 153
Fabian Hueske Avatar answered Oct 21 '22 05:10

Fabian Hueske