
How to set kafka-connect connector and task's JVM heap size?

Does Kafka Connect start a new connector and its tasks within the Kafka Connect process, or is a new JVM process forked?

If it starts the plugin within the Kafka Connect process, then I need to set the Kafka Connect JVM heap size via KAFKA_CONNECT_JVM_HEAP_OPT (I am using the Confluent Docker image). The problem is that if I start many tasks or many connectors, they will all share the same JVM heap, so it is hard to decide on a heap size for Kafka Connect.

If Kafka Connect instead starts each connector in a new JVM process, how can I set the heap size for each of them?

Xiang Zhang asked Mar 06 '23 02:03



1 Answer

Kafka Connect has basic support for multi-tenancy. Specifically, you can run several connector instances within the same Connect worker.

Each Connect worker always maps to a single JVM instance. A request to start a new connector does not result in spawning a new JVM instance; the connector and its tasks run as threads inside the worker's JVM and share its heap. Connect workers with the same group.id form a cluster of Connect workers, and connector tasks are distributed among the workers in that cluster.

A Connect worker's heap size can be easily set using:

export KAFKA_HEAP_OPTS="-Xms256M -Xmx2G" (this example uses the default values)

or, when a docker image is used, by setting:

-e KAFKA_HEAP_OPTS="-Xms256M -Xmx2G" (again this example uses the default values)
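
As a minimal sketch of the non-Docker case, the snippet below builds the option string and applies it only when KAFKA_HEAP_OPTS is not already set in the environment. The helper name heap_opts is hypothetical and used here for illustration only; it is not part of Kafka.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kafka): build a JVM heap option string
# from an initial heap size ($1) and a maximum heap size ($2).
heap_opts() {
  printf '%s %s\n' "-Xms$1" "-Xmx$2"
}

# Keep any value the operator already exported; otherwise fall back to
# the default values quoted in the answer above.
export KAFKA_HEAP_OPTS="${KAFKA_HEAP_OPTS:-$(heap_opts 256M 2G)}"

echo "Worker heap options: $KAFKA_HEAP_OPTS"
```

The worker would then be started from the same shell, for example with bin/connect-distributed.sh and a worker properties file, so that the exported variable is inherited by the JVM launch script.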

Connect workers can be scaled horizontally. Adding more workers to a Connect cluster adds memory and computing resources to your deployment. If you need to apply a more specific and tight memory budget to your Connect deployment, you might choose to assign specific connectors to separate Connect clusters, or even, in some cases, deploy one connector instance per Connect cluster.
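
One way to picture the "separate cluster per group of connectors" approach is a small script that prints one launch command per cluster, each with its own heap limit. This is only a sketch: the function name print_worker_cmd, the cluster names, the heap sizes, and the config/NAME.properties paths are all hypothetical. It assumes each properties file sets a distinct group.id (and distinct config, offset, and status storage topics) so that the workers do not join the same Connect cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the command that would start one Connect
# worker for cluster $1 with a maximum heap of $2.
# Assumes config/$1.properties exists and sets group.id=$1.
print_worker_cmd() {
  printf 'KAFKA_HEAP_OPTS="-Xms256M -Xmx%s" bin/connect-distributed.sh config/%s.properties\n' "$2" "$1"
}

# Illustrative budgets: a memory-hungry connector gets its own cluster
# with a larger heap, a lightweight one gets a smaller heap.
print_worker_cmd connect-heavy 4G
print_worker_cmd connect-light 1G
```

Printing the commands rather than running them keeps the sketch safe to try; the resulting lines can be reviewed and then pasted into a service definition or run by hand.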

Konstantine Karantasis answered May 09 '23 06:05
