Does Kafka Connect start a new connector and its tasks within the Kafka Connect process, or is a new JVM process forked?
If it starts the plugin within the Kafka Connect process, then I need to set the Kafka Connect JVM heap size via KAFKA_CONNECT_JVM_HEAP_OPT
(using the Confluent Docker image). The problem is that if I start many tasks or many connectors, they will all share the JVM heap, so it is hard to decide on a heap size for Kafka Connect.
If Kafka Connect starts each connector in a new JVM process, how can I set the heap size for them?
Kafka Connect has basic support for multi-tenancy: several connector instances can run within the same Connect worker.
Each Connect worker always maps to a single JVM instance; a request to start a new connector does not result in spawning a new JVM process. However, Connect workers that share the same group.id
form a Connect cluster, and connector tasks are distributed among the workers of that cluster.
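As a minimal sketch, two workers started in distributed mode with the following properties would join the same cluster and share connector tasks (the group.id and topic names here are illustrative, not defaults):

```
# connect-distributed.properties (excerpt)
bootstrap.servers=localhost:9092

# Workers with the same group.id form one Connect cluster
group.id=connect-cluster-a

# Internal topics the cluster uses to store connector configs, offsets, and status
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
```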
A Connect worker's heap size can be easily set using:
export KAFKA_HEAP_OPTS="-Xms256M -Xmx2G"
(this example uses the default values)
or, when a docker image is used, by setting:
-e CONNECT_KAFKA_HEAP_OPTS="-Xms256M -Xmx2G"
(again this example uses the default values)
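In a Docker-based deployment, that environment variable can be set in a Compose file; the sketch below assumes the Confluent cp-kafka-connect image, whose CONNECT_-prefixed environment variables map to worker properties (the image tag, group id, and port mapping are illustrative):

```yaml
# docker-compose.yml (excerpt)
connect:
  image: confluentinc/cp-kafka-connect:7.4.0   # illustrative tag
  environment:
    # JVM heap for this worker's single JVM (these are the default values)
    CONNECT_KAFKA_HEAP_OPTS: "-Xms256M -Xmx2G"
    # Workers sharing this group.id form one Connect cluster
    CONNECT_GROUP_ID: "connect-cluster-a"
  ports:
    - "8083:8083"   # Connect REST API
```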
Connect workers can be scaled horizontally: adding workers to a Connect cluster adds memory and computing resources to your deployment. If you need to apply a tighter, more specific memory budget, you might choose to group specific connectors into dedicated Connect clusters, or in some cases even deploy one connector instance per Connect cluster.
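Connector instances themselves are created per cluster through the Connect REST API (by default on port 8083), e.g. with a POST to /connectors. A request payload looks like the following; the connector name, file path, and topic are illustrative, and FileStreamSourceConnector is just an example connector class shipped with Kafka:

```json
{
  "name": "my-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "2",
    "file": "/tmp/input.txt",
    "topic": "my-topic"
  }
}
```

The resulting tasks run inside the JVMs of the workers in that cluster, so their memory consumption counts against the heap configured above.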