
How does Kafka Streams allocate partitions?

I have a Kafka Streams application that is receiving data from topic-1 as KStream and topic-2 as KTable. Both topics have 4 partitions each. Let's say that I have 4 instances of the application running, then each instance will receive data from a single partition for topic-1. How about topic-2 which is received as KTable? Are all instances going to receive data from all 4 partitions in that case? If both the topics are keyed the same, then I guess Kafka Streams will ensure that the same partitions are allocated for an application. If topic-2 doesn't have any keys, but rather the application is going to infer that from the value itself, then that means that all the instances need to get all partitions from topic-2. How does Kafka Streams handle this situation?

Thank you!

asked Apr 27 '18 by sobychacko

People also ask

How do partitions get assigned in Kafka?

Whenever a consumer enters or leaves a consumer group, the brokers rebalance the partitions across consumers, meaning Kafka handles load balancing with respect to the number of partitions per application instance for you. This is great; it's a major feature of Kafka.

How are Kafka partitions distributed?

Partitions are the way that Kafka provides scalability. A Kafka cluster is made up of one or more servers, called brokers in the Kafka universe. Each broker holds a subset of the records that belong to the entire cluster. Kafka distributes the partitions of a particular topic across multiple brokers.

How does Kafka producer choose partition?

By default, Kafka producer relies on the key of the record to decide to which partition to write the record. For two records with the same key, the producer will always choose the same partition.
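To make the "same key, same partition" property concrete, here is a minimal pure-Java sketch. Note that Kafka's real default partitioner uses murmur2 over the serialized key bytes, not `hashCode()`; any deterministic hash demonstrates the same property.

```java
public class PartitionDemo {
    // Illustrative stand-in for the producer's default partitioner:
    // the real one computes murmur2(keyBytes) % numPartitions, but the
    // key property is the same -- equal keys always map to one partition.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 4;
        int p1 = partitionFor("user-42", partitions);
        int p2 = partitionFor("user-42", partitions);
        System.out.println("same partition: " + (p1 == p2));
    }
}
```

Because the mapping is deterministic, two topics written with the same keys, the same partitioner, and the same partition count end up co-partitioned, which is what joins rely on.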

How Kafka stream works internally?

Kafka Streams partitions data for processing. This partitioning is what enables data locality, elasticity, scalability, high performance, and fault tolerance. Kafka Streams uses the concepts of stream partitions and stream tasks as logical units of its parallelism model.


1 Answer

KTables are sharded according to the input topic's partitions. Thus, just like with a KStream, each instance gets one topic partition assigned and materializes that partition as its shard of the KTable. Kafka Streams makes sure that partitions of different topics are co-located, i.e., one instance will be assigned topic-1 partition-0 and topic-2 partition-0 (and so forth).
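The co-location described above can be sketched in plain Java (a hypothetical model, not Streams' actual assignor code): task i is responsible for partition i of every co-partitioned input topic, so a join on the same key sees both sides locally.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CoPartitionAssignment {
    // Model of Kafka Streams' co-partitioned task assignment:
    // task p processes topic-1 partition p AND topic-2 partition p,
    // so records with the same key from both topics land in one task.
    static Map<Integer, List<String>> assign(int numPartitions) {
        Map<Integer, List<String>> tasks = new HashMap<>();
        for (int p = 0; p < numPartitions; p++) {
            tasks.put(p, List.of("topic-1-" + p, "topic-2-" + p));
        }
        return tasks;
    }

    public static void main(String[] args) {
        // With 4 partitions and 4 instances, each instance runs one task.
        assign(4).forEach((task, parts) ->
            System.out.println("task " + task + " -> " + parts));
    }
}
```

With 4 instances and 4 partitions, each instance ends up running exactly one such task, which is why each instance reads a single partition of each topic.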

If topic-2 has no key set, data will be randomly distributed across the topic's partitions. For this case, you can use a GlobalKTable instead. A GlobalKTable is a full replication of all partitions on every instance. If you do a KStream-GlobalKTable join, you can specify a "mapper" that derives the table lookup key from the stream record (i.e., you can extract the join attribute from the value).
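Here is a pure-Java sketch of that lookup (hypothetical names; the GlobalKTable is modeled as an in-memory map, and the mapper plays the role of the KeyValueMapper passed to the real KStream-GlobalKTable join):

```java
import java.util.Map;
import java.util.function.Function;

public class GlobalTableJoinSketch {
    // Model of a KStream-GlobalKTable join: every instance holds a FULL
    // copy of the table data, and the mapper extracts the join attribute
    // from the stream record's VALUE to look up the table-side record.
    static String join(String streamValue,
                       Function<String, String> mapper,
                       Map<String, String> globalTable) {
        String joinKey = mapper.apply(streamValue); // key derived from value
        return streamValue + "|" + globalTable.get(joinKey);
    }

    public static void main(String[] args) {
        // Example: stream value "order:widget-7"; the join attribute is
        // the product id embedded in the value, not the record key.
        Map<String, String> products = Map.of("widget-7", "Widget (7mm)");
        Function<String, String> mapper = v -> v.split(":")[1];
        System.out.println(join("order:widget-7", mapper, products));
    }
}
```

Because the table is fully replicated, the lookup works on every instance regardless of which partition the stream record came from.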

Note: a KStream-GlobalKTable join has different semantics than a KStream-KTable join. In contrast to the latter, it is not time-synchronized, and thus the join is non-deterministic by design with regard to GlobalKTable updates; i.e., there is no guarantee which KStream record will be the first to "see" a GlobalKTable update and thus join with the updated GlobalKTable record.

answered Sep 28 '22 by Matthias J. Sax