
What is the use of __consumer_offsets and _schema topics in Kafka?

After setting up a Kafka broker cluster and creating a few topics, we found that the following two topics had been created automatically:

  1. __consumer_offsets
  2. _schema

What is the purpose of these topics, and why do they matter?

Anshul Patel asked Sep 16 '16 10:09




2 Answers

__consumer_offsets stores the committed offsets for each topic:partition, per consumer group (group ID). It is a compacted topic: older records with the same key are periodically removed, so only the latest committed offset for each group and topic:partition is retained.
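To illustrate what compaction means here, the following is a minimal sketch in plain Python, not Kafka's actual implementation. It assumes each record is keyed by (group ID, topic, partition) and carries a committed offset; the group and topic names are made up for the example.

```python
from typing import Dict, List, Tuple

# Key of an offset record: (group id, topic, partition).
OffsetKey = Tuple[str, str, int]
# A log record: the key plus the committed offset.
Record = Tuple[OffsetKey, int]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the most recent record for each key, preserving log order."""
    last_index: Dict[OffsetKey, int] = {}
    for i, (key, _) in enumerate(log):
        last_index[key] = i  # later records overwrite earlier positions
    return [rec for i, rec in enumerate(log) if last_index[rec[0]] == i]


# Hypothetical commits from two consumer groups reading the topic "orders".
log = [
    (("billing", "orders", 0), 10),
    (("billing", "orders", 1), 7),
    (("billing", "orders", 0), 25),
    (("audit", "orders", 0), 3),
    (("billing", "orders", 0), 40),
]
compacted = compact(log)
```

After compaction, the three commits for ("billing", "orders", 0) collapse into the single latest one, while the other keys keep their only record.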

_schema (named _schemas by default) is not a default Kafka topic (at least as of Kafka 0.8 and 0.9). It is created by Confluent's Schema Registry. See: Confluent Schema Registry - github.com/confluentinc/schema-registry (thanks @serejja)

Natalia answered Sep 28 '22 04:09



__consumer_offsets: Every consumer group tracks its offset for each topic partition it consumes. Since v0.9, the committed offsets of every consumer group are stored in this internal topic (before v0.9 they were stored in ZooKeeper). When the offset manager receives an OffsetCommitRequest, it appends the request to the special compacted Kafka topic named __consumer_offsets. The offset manager sends a successful offset commit response to the consumer only after all replicas of the offsets topic have received the offsets.
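The acknowledgement rule described above can be sketched as a toy model. This is only an illustration in plain Python, not broker code: the Replica class, the commit_offset function, and the record layout are all hypothetical, and a real broker deals with details (leaders, retries, timeouts) that are left out here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical record layout: (group id, topic, partition, offset).
OffsetRecord = Tuple[str, str, int, int]


@dataclass
class Replica:
    """Hypothetical stand-in for one replica of the offsets topic."""
    available: bool = True
    log: List[OffsetRecord] = field(default_factory=list)

    def append(self, record: OffsetRecord) -> bool:
        """Store the record; report failure if the replica is unavailable."""
        if not self.available:
            return False
        self.log.append(record)
        return True


def commit_offset(replicas: List[Replica], record: OffsetRecord) -> bool:
    """Acknowledge the commit only if every replica stored the record."""
    results = [replica.append(record) for replica in replicas]
    return all(results)


replicas = [Replica(), Replica(), Replica()]
acked = commit_offset(replicas, ("billing", "orders", 0, 42))

replicas[2].available = False  # simulate one replica falling behind
not_acked = commit_offset(replicas, ("billing", "orders", 0, 43))
```

In this model the first commit is acknowledged because all three replicas stored it, and the second is not because one replica never received it.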

_schemas: This is an internal topic used by the Schema Registry, which is a distributed storage layer for Avro schemas. Everything relevant to schemas is appended to this topic: the schema itself, its subject (with the corresponding version), metadata, and compatibility configuration. The Schema Registry both produces to this topic (e.g. when a new schema is registered under a subject) and consumes from it.
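The produce-and-consume pattern can be pictured with a small sketch. It is a simplified model in plain Python, not the Schema Registry's real record format or API: the event shape, the function names, and the subjects are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (subject, schema text).
SchemaEvent = Tuple[str, str]


def register(log: List[SchemaEvent], subject: str, schema: str) -> None:
    """Produce side: append a registration event to the log."""
    log.append((subject, schema))


def replay(log: List[SchemaEvent]) -> Dict[str, List[str]]:
    """Consume side: rebuild subject -> ordered schema versions from the log."""
    state: Dict[str, List[str]] = {}
    for subject, schema in log:
        versions = state.setdefault(subject, [])
        if schema not in versions:  # an identical schema adds no new version
            versions.append(schema)
    return state


def version_of(state: Dict[str, List[str]], subject: str, schema: str) -> int:
    """Versions are numbered from 1 in registration order."""
    return state[subject].index(schema) + 1


schemas_log: List[SchemaEvent] = []
register(schemas_log, "user-value", '{"type": "string"}')
register(schemas_log, "user-value", '{"type": "long"}')
register(schemas_log, "order-value", '{"type": "string"}')

state = replay(schemas_log)
```

Because the state is derived entirely from the log, any instance that replays the same topic from the beginning arrives at the same subjects and versions, which is the reason for keeping this information in a topic.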

Giorgos Myrianthous answered Sep 28 '22 03:09
