Suppose there is a Kafka topic with 10 partitions and we'd like to use Flink to consume it. We want the system to allocate slots dynamically according to the workload: if the workload is low, the Flink job should use fewer slots (lower parallelism), and if the workload is high it should run with higher parallelism. Is there a good way to achieve this? It seems that the parallelism can only be changed by stopping the job first. If so, does the pause period affect the real-time behavior of the application? Are there any other ways to change the parallelism? Thank you very much.
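For reference, the job looks roughly like this minimal sketch (topic name, consumer group, and broker address are placeholders, and it assumes the universal FlinkKafkaConsumer connector):

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaConsumerJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Parallelism is set once at submission time; the question is whether
        // it can grow or shrink with the workload while the job is running.
        env.setParallelism(10);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "my-consumer-group");       // placeholder group

        // "my-topic" stands in for the 10-partition topic.
        DataStream<String> stream = env.addSource(
                new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props));

        stream.print();

        env.execute("kafka-consumer-job");
    }
}
```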
There is a REST API call for modifying the parallelism of a running job, but currently the only way to redistribute the state is to create a savepoint and restart from it, so that's how rescaling works (at least for now).
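As a rough sketch of that workflow: take a savepoint (e.g. with `flink savepoint <jobId>`) and then resubmit with something like `flink run -s <savepointPath> -p <newParallelism> job.jar`. The one thing worth setting up front in the job itself is the max parallelism, since a restored job can never be rescaled beyond it. The values and job name below are arbitrary:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RescalableJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Initial parallelism; a later "flink run -s <savepointPath> -p 20 job.jar"
        // can restore the savepointed state at a different parallelism.
        env.setParallelism(10);

        // Keyed state is split into this many key groups, so the job can never
        // be rescaled above this value. Choose it before the first savepoint.
        env.setMaxParallelism(128);

        // Trivial pipeline so the sketch actually runs; replace with the real topology.
        env.fromElements(1, 2, 3).print();

        env.execute("rescalable-job");
    }
}
```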
If your application is using event time processing, then the results should be unaffected by the restart, but they will be delayed, of course, by the downtime.
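To illustrate the event-time point, here is a rough sketch (the SensorReading type, its field names, and the 5-second bound on out-of-orderness are made up): because window results depend on the timestamps carried in the events rather than on wall-clock time, a savepoint-and-restart only delays when the results are emitted, not what they are.

```java
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeSketch {

    // Hypothetical event type carrying its own timestamp.
    public static class SensorReading {
        public String sensorId;
        public long timestampMillis;
        public double value;

        public SensorReading() {}

        public SensorReading(String sensorId, long timestampMillis, double value) {
            this.sensorId = sensorId;
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<SensorReading> events = env.fromElements(
                new SensorReading("sensor-1", 1_000L, 1.0),
                new SensorReading("sensor-1", 2_000L, 2.0));

        // Timestamps and watermarks come from the events themselves, so the
        // window results do not depend on when the job happens to be running.
        events.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<SensorReading>(Time.seconds(5)) {
                    @Override
                    public long extractTimestamp(SensorReading r) {
                        return r.timestampMillis;
                    }
                })
                .keyBy(r -> r.sensorId)
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                .sum("value")
                .print();

        env.execute("event-time-sketch");
    }
}
```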
Update: previously there was a CLI command to do the rescaling, but this was temporarily disabled in Flink 1.9.0. See FLINK-12312.
Is there a good way to achieve dynamic scaling?
As far as I know, the answer is NO for now. However, FLIP-6 (Flink Deployment and Process Model) shows that dynamic scaling has been under consideration.
Does the pause period affect the real-time behavior of the application?
Yes. Time will be spent cancelling the job, restarting it, reallocating resources and state, and so on.