Suppose there is a Kafka topic with 10 partitions and we'd like to use Flink to consume it. We want the system to allocate slots dynamically according to the workload: if the workload is low, the Flink job should use fewer slots (lower parallelism), and if the workload is high it should run with higher parallelism. Is there a good way to achieve this? It seems that the parallelism can only be changed by stopping the job first. If so, does the pause period affect the real-time behavior of the application? Are there any other ways to change the parallelism? Thank you very much.
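For reference, the job looks roughly like this minimal sketch (topic name, consumer group, and broker address are placeholders, and it assumes the universal FlinkKafkaConsumer connector):

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaConsumerJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Parallelism is set once at submission time; the question is whether
        // it can grow or shrink with the workload while the job is running.
        env.setParallelism(10);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "my-consumer-group");       // placeholder group

        // "my-topic" stands in for the 10-partition topic.
        DataStream<String> stream = env.addSource(
                new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props));

        stream.print();

        env.execute("kafka-consumer-job");
    }
}
```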
There is a REST API call for modifying the parallelism of a running job, but currently the only way to redistribute the state is to create a savepoint and restart from it, so that's how rescaling works (at least for now).
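As a rough sketch of that workflow: take a savepoint (e.g. with `flink savepoint <jobId>`) and then resubmit with something like `flink run -s <savepointPath> -p <newParallelism> job.jar`. The one thing worth setting up front in the job itself is the max parallelism, since a restored job can never be rescaled beyond it. The values and job name below are arbitrary:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RescalableJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Initial parallelism; a later "flink run -s <savepointPath> -p 20 job.jar"
        // can restore the savepointed state at a different parallelism.
        env.setParallelism(10);

        // Keyed state is split into this many key groups, so the job can never
        // be rescaled above this value. Choose it before the first savepoint.
        env.setMaxParallelism(128);

        // Trivial pipeline so the sketch actually runs; replace with the real topology.
        env.fromElements(1, 2, 3).print();

        env.execute("rescalable-job");
    }
}
```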
If your application is using event time processing, then the results should be unaffected by the restart, but they will be delayed, of course, by the downtime.
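To illustrate the event-time point, here is a rough sketch (the SensorReading type, its field names, and the 5-second bound on out-of-orderness are made up): because window results depend on the timestamps carried in the events rather than on wall-clock time, a savepoint-and-restart only delays when the results are emitted, not what they are.

```java
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeSketch {

    // Hypothetical event type carrying its own timestamp.
    public static class SensorReading {
        public String sensorId;
        public long timestampMillis;
        public double value;

        public SensorReading() {}

        public SensorReading(String sensorId, long timestampMillis, double value) {
            this.sensorId = sensorId;
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<SensorReading> events = env.fromElements(
                new SensorReading("sensor-1", 1_000L, 1.0),
                new SensorReading("sensor-1", 2_000L, 2.0));

        // Timestamps and watermarks come from the events themselves, so the
        // window results do not depend on when the job happens to be running.
        events.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<SensorReading>(Time.seconds(5)) {
                    @Override
                    public long extractTimestamp(SensorReading r) {
                        return r.timestampMillis;
                    }
                })
                .keyBy(r -> r.sensorId)
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                .sum("value")
                .print();

        env.execute("event-time-sketch");
    }
}
```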
Update: previously there was a CLI command to do the rescaling, but this was temporarily disabled in Flink 1.9.0. See FLINK-12312.
Is there a good way to achieve dynamic scaling?
As far as I know, the answer is NO for now. However, FLIP-6 (Flink Deployment and Process Model) shows that dynamic scaling has been under consideration.
Does the pause period affect the real-time behavior of the application?
Yes. Time will be spent cancelling the job, restarting it, reallocating resources and state, and so on.