Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Confluent Schema Registry Cluster Mode

I am using Kafka Connect from Confluent to consume Kafka stream and write to HDFS in parquet format. I am using Schema Registry service in 1 node and it is running fine. Now I want to distribute Schema Registry to cluster mode to handle fail over. Any link or snippet on how to achieve that will be very useful.

like image 997
vijay kumar vijju Avatar asked Aug 26 '16 09:08

vijay kumar vijju


3 Answers

It is hard to find, but we covered this architecture in our documentation: http://docs.confluent.io/3.0.0/schema-registry/docs/deployment.html#multi-dc-setup

To quote from the docs a bit (although you should read the docs, lots of good architecture advice and a recovery runbook are included):

Assuming you have Schema Registry running, here are the recommended steps to add Schema Registry instances in a new “slave” datacenter (call it DC B):

In DC B, make sure Kafka has unclean.leader.election.enable set to false. In Kafka in DC B, create the _schemas topic. It should have 1 partition, kafkastore.topic.replication.factor of 3, and min.insync.replicas at least 2. In DC B, run MirrorMaker with Kafka in the “master” datacenter as the source and Kafka in DC B as the target. In the Schema Registry config files in DC B, set kafkastore.connection.url and schema.registry.zk.namespace to match the instances already running, and set master.eligibility to false. Start your new Schema Registry instances with these configs.

like image 161
Gwen Shapira Avatar answered Nov 07 '22 21:11

Gwen Shapira


I used confluent schema-registry docker image to form the cluster.

docker run --restart always -d -p 8081:8081 --name=schema-registry-1 -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-1 -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 -e SCHEMA_REGISTRY_DEBUG=true confluentinc/cp-schema-registry:5.2.1-1

docker run --restart always -d -p 8081:8081 --name=schema-registry-2 -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-2 -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 -e SCHEMA_REGISTRY_DEBUG=true confluentinc/cp-schema-registry:5.2.1-1

docker run --restart always -d -p 8081:8081 --name=schema-registry-3 -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=ip1:2181,ip2:2181,ip3:2181 -e SCHEMA_REGISTRY_HOST_NAME=schema-registry-3 -e SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081 -e SCHEMA_REGISTRY_DEBUG=true confluentinc/cp-schema-registry:5.2.1-1

Once this is up and running I verified if schema-registry cluster is formed and if its leader election is successful, by checking the zookeeper contents.

$ docker exec -it zookeeper bash
# /usr/bin/zookeeper-shell localhost:2181
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[schema_registry, cluster, controller, brokers, zookeeper, admin, isr_change_notification, log_dir_event_notification, controller_epoch, kafka-manager, CruiseControlBrokerList, consumers, latest_producer_id_block, config]
[zk: localhost:2181(CONNECTED) 1] ls /schema_registry
[schema_registry_master, schema_id_counter]
[zk: localhost:2181(CONNECTED) 4] get /schema_registry/schema_registry_master
{"host":"schema-registry-1","port":8081,"master_eligibility":true,"scheme":"http","version":1}
#

Hope this helps.

like image 25
mchawre Avatar answered Nov 07 '22 21:11

mchawre


You just need to put this in the connect-avro-distributed.properties to use multi schema registry:

key.converter.schema.registry.url=http://node1:8081,http://node2:8081
value.converter.schema.registry.url=http://node1:8081,http://node2:8081

Hope this is useful for you.

like image 33
Feiran Avatar answered Nov 07 '22 22:11

Feiran