 

Flink RichSinkFunction constructor VS open()

Let's say I need to implement a custom sink using RichSinkFunction, and the sink needs some variables such as a DBConnection. Where should I initialize the DBConnection? Most of the articles I have seen initialize it in the open() method. Why not in the constructor?

A follow-up question: what kind of variables should be initialized in the constructor, and what should be initialized in open()?

Jiawei Wu asked Nov 19 '25 05:11


1 Answer

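The constructor of a RichFunction is invoked only on the client side, where the job is assembled. The function instance is then serialized and shipped to the cluster, so the constructor does not run again on the task managers. Anything that actually has to happen on the cluster should be done in open().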

open() is also the place to access the parameters of your Flink job or the RuntimeContext (for state, counters, etc.), since neither is available yet when the constructor runs. Whatever you acquire in open() should be released in the matching close().

So to answer your question: your DBConnection should be initialized in open() only. A live connection is generally not serializable, so it could not be shipped from the client anyway. In the constructor, you usually just store job-constant parameters in fields, such as how to extract the key from your records if your sink is meant to be reused across multiple projects with different data structures.
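To make the split concrete, here is a minimal sketch in plain Java. It does not use any Flink classes: DemoSink, LifecycleDemo and shipToWorker are hypothetical names, and the round trip through Java serialization only imitates how a function object travels from the client to a worker. The "connection" is a placeholder object, not a real database connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a sink. This is NOT a Flink class.
class DemoSink implements Serializable {
    private static final long serialVersionUID = 1L;

    // Counts constructor calls in this JVM, to show that
    // deserialization does not go through the constructor.
    static int constructorCalls = 0;

    // Job-constant configuration: set once in the constructor,
    // serialized together with the object.
    private final String jdbcUrl;

    // Runtime resource: transient, so it is never serialized.
    // Placeholder for a real DB connection.
    private transient StringBuilder connection;

    DemoSink(String jdbcUrl) {
        constructorCalls++;
        this.jdbcUrl = jdbcUrl;
    }

    // Acquire the resource where the work actually happens.
    void open() {
        connection = new StringBuilder("connected to " + jdbcUrl);
    }

    // Release it symmetrically.
    void close() {
        connection = null;
    }

    boolean isOpen() {
        return connection != null;
    }

    String jdbcUrl() {
        return jdbcUrl;
    }
}

class LifecycleDemo {
    // Imitates shipping the function to a worker: serialize, then deserialize.
    static DemoSink shipToWorker(DemoSink sink) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(sink);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DemoSink) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        DemoSink onClient = new DemoSink("jdbc:example://db"); // constructor runs here
        DemoSink onWorker = shipToWorker(onClient);            // constructor does not run here

        System.out.println("constructor calls: " + DemoSink.constructorCalls);
        System.out.println("config survived:   " + onWorker.jdbcUrl());
        System.out.println("open before open(): " + onWorker.isOpen());

        onWorker.open();
        System.out.println("open after open():  " + onWorker.isOpen());
        onWorker.close();
    }
}
```

The copy that arrives on the "worker" still has its jdbcUrl, but its transient connection field is empty until open() is called. A real RichSinkFunction would presumably follow the same shape, with the connection created in its open method and closed in close(); the exact open signature depends on the Flink version you are using.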

Arvid Heise answered Nov 21 '25 08:11