I successfully set up a RabbitMQ cluster using Docker Compose to understand the clustering concept. It worked fine; below is the docker-compose file:
rabbit1:
  image: ipf-queue-node
  hostname: rabbit1
  cap_add:
    - ALL
    - NET_ADMIN
    - SYS_ADMIN
  ports:
    - "5671:5671"
    - "5672:5672"
    - "15672:15672"

rabbit2:
  image: ipf-queue-node
  hostname: rabbit2
  cap_add:
    - ALL
    - NET_ADMIN
    - SYS_ADMIN
  links:
    - rabbit1
  environment:
    - CLUSTERED=true
    - CLUSTER_WITH=rabbit1
    - RAM_NODE=true
  ports:
    - "5675:5671"
    - "5673:5672"
    - "15673:15672"
Dockerfile content:
FROM queue-base
# Create directories
RUN mkdir /opt/rabbit
RUN mkdir /opt/simulator
RUN mkdir /opt/simulator/tools
# Add the files from the local repository into the container
ADD rabbitmq.config /etc/rabbitmq/
ADD rabbitmq-env.conf /etc/rabbitmq/
ADD erlang.cookie /var/lib/rabbitmq/.erlang.cookie
ADD startclusternode.sh /opt/rabbit/
ADD debugnodes.sh /opt/rabbit/
ADD tl /bin/tl
ADD rl /bin/rl
ADD rst /bin/rst
# Add the simulator tooling
ADD simulator_tools/ /opt/simulator/tools/
ADD ./testca /tmp/ssl
ADD ./server /tmp/ssl
# Set the file permissions in the container
RUN chmod 644 /etc/rabbitmq/rabbitmq.config
RUN chmod 644 /etc/rabbitmq/rabbitmq-env.conf
RUN chmod 400 /var/lib/rabbitmq/.erlang.cookie
RUN chmod 777 /opt/rabbit/startclusternode.sh
RUN chmod 777 /opt/rabbit/debugnodes.sh
RUN chmod 777 /bin/tl
RUN chmod 777 /bin/rl
RUN chmod 777 /bin/rst
RUN chmod -R 777 /opt/simulator
# Set ownership permissions on files in the container
RUN chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
# Expose ports inside the container to the host
EXPOSE 5672
EXPOSE 5671
EXPOSE 15672
EXPOSE 25672
# Run this to debug the cluster nodes by allowing ssh login
#CMD /opt/rabbit/debugnodes.sh
# Run this to autostart the cluster nodes
CMD /opt/rabbit/startclusternode.sh
startclusternode.sh is the script that sets up the cluster:
#!/bin/bash
logfile="/tmp/rabbitnode.log"
firsttimefile="/tmp/firsttimerunning"
curhostname=$(hostname)
username="<<username>>"
password="<<password>>"
echo "" > $logfile
echo "New Start Date:" >> $logfile
date >> $logfile
echo "" >> $logfile
( sleep 40 ; \
rabbitmqctl add_user $username $password ; \
rabbitmqctl set_user_tags $username administrator ; \
rabbitmqctl add_vhost $curhostname ; \
rabbitmqctl add_vhost localhost; \
rabbitmqctl set_permissions -p $curhostname $username ".*" ".*" ".*" ; \
rabbitmqctl set_permissions -p localhost $username ".*" ".*" ".*" ; \
rabbitmqctl set_policy ha-all "" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
) &
sleep 5
# For version 3.5.6 the first time running the cluster needs to enable the plugins
if [ -f $firsttimefile ]; then
    echo "First Time Running Enabling Plugins" >> $logfile
    /usr/sbin/rabbitmq-server -d &
    echo "Waiting for RabbitMQ Server to start" >> $logfile
    sleep 3
    echo "Enabling Plugins" >> $logfile
    /usr/sbin/rabbitmq-plugins enable rabbitmq_stomp rabbitmq_management rabbitmq_management_agent rabbitmq_management_visualiser rabbitmq_federation rabbitmq_federation_management sockjs >> $logfile
    echo "Waiting for Plugins to finish" >> $logfile
    sleep 1
    echo "Stopping the RabbitMQ using stop_app" >> $logfile
    /usr/sbin/rabbitmqctl stop_app
    echo "Stopping the RabbitMQ using stop" >> $logfile
    /usr/sbin/rabbitmqctl stop
    echo "Stopping the RabbitMQ Server" >> $logfile
    # Bracket the first letter so grep does not match its own process
    kill -9 $(ps auwwx | grep '[r]abbitmq-server' | awk '{print $2}')
    sleep 1
    echo "Done First Time Running Enabling Plugins" >> $logfile
    rm -f $firsttimefile
    echo "Done Cleanup First Time File" >> $logfile
    # Allow the cluster nodes to wait for the master to start the first time
    if [ -z "$CLUSTERED" ]; then
        echo "Ignoring as this is the server node" >> $logfile
    else
        if [ -z "$CLUSTER_WITH" ]; then
            echo "Ignoring as this is the cluster master node" >> $logfile
        else
            echo "Waiting for the master node to start up" >> $logfile
            sleep 5
            echo "Done waiting for the master node to start up" >> $logfile
        fi
    fi
fi
if [ -z "$CLUSTERED" ]; then
echo "Starting non-Clustered Server Instance" >> $logfile
# if not clustered then start it normally as if it is a single server
/usr/sbin/rabbitmq-server >> $logfile
echo "Done Starting non-Clustered Server Instance" >> $logfile
# Tail to keep the foreground process active.
tail -f /var/log/rabbitmq/*
else
if [ -z "$CLUSTER_WITH" ]; then
# If clustered, but cluster is not specified then start normally as this could be the first server in the cluster
echo "Starting Single Server Instance" >> $logfile
/usr/sbin/rabbitmq-server >> $logfile
echo "Done Starting Single Server Instance" >> $logfile
else
echo "Starting Clustered Server Instance as a DETACHED single instance" >> $logfile
/usr/sbin/rabbitmq-server -detached >> $logfile
echo "Stopping App with /usr/sbin/rabbitmqctl stop_app" >> $logfile
/usr/sbin/rabbitmqctl stop_app >> $logfile
# This should attempt to join a cluster master node from the yaml file
if [ -z "$RAM_NODE" ]; then
echo "Attempting to join as DISC node: /usr/sbin/rabbitmqctl join_cluster rabbit@$CLUSTER_WITH" >> $logfile
/usr/sbin/rabbitmqctl join_cluster rabbit@$CLUSTER_WITH >> $logfile
else
echo "Attempting to join as RAM node: /usr/sbin/rabbitmqctl join_cluster --ram rabbit@$CLUSTER_WITH" >> $logfile
/usr/sbin/rabbitmqctl join_cluster --ram rabbit@$CLUSTER_WITH >> $logfile
fi
echo "Starting App" >> $logfile
/usr/sbin/rabbitmqctl start_app >> $logfile
echo "Done Starting Cluster Node" >> $logfile
fi
# Tail to keep the foreground process active.
tail -f /var/log/rabbitmq/*
fi
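Note that the join in this script only succeeds because both containers share the same .erlang.cookie (baked into the image above) and can resolve each other's hostnames via the Compose link. A quick sanity check from inside rabbit2 might look like this (a sketch):

ping -c 1 rabbit1                     # hostname resolution via the Compose link
rabbitmqctl -n rabbit@rabbit1 status  # remote node reachable with the shared cookie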
The problem is that when I tried to do the same setup with Kubernetes, the slave node is unable to connect to the master. The approach I took was to create one pod for the master node and another for the slave node, and to pass the master's hostname (currently hard-coded) through an environment variable. I also checked the log file at /tmp/rabbitnode.log; it is picking up all the environment variables correctly, but the node is unable to register with the master. I tried the join manually with rabbitmqctl as well, but it didn't work; it reports the host as unreachable. Changing the /etc/hosts file didn't help either.
My understanding is that pods in Kubernetes communicate through Services, so passing the container hostname directly doesn't work, while RabbitMQ clustering is based on hostnames.
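For reference, the failing manual attempt from inside the slave pod looked roughly like this (a sketch; the pod name is a placeholder, and the master name is the hard-coded one from the YAML below):

kubectl exec -it <slave-pod-name> -- /bin/bash
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbitmq-controller-2ll1s   # fails: host unreachable
rabbitmqctl start_app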
Has anybody tried a workaround for this? I want to run the master and slaves on different nodes. Below are the contents of the master and slave pod definitions:
apiVersion: v1
kind: Service
metadata:
  name: rabbitmqsvc
  labels:
    app: queue-master
spec:
  ports:
  - port: 5672
    name: queue-rw-port
  - port: 15672
    name: queue-mgt-port
    nodePort: 31606
  - port: 5671
    name: queue-ssl
    nodePort: 32718
  selector:
    app: queue-master
  type: NodePort
  clusterIP: 10.16.0.121
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: rabbitmq-controller
  labels:
    app: queue-master
spec:
  replicas: 1
  selector:
    app: queue-master
  template:
    metadata:
      name: rabbitmq-pod
      labels:
        app: queue-master
    spec:
      nodeSelector:
        nodesize: small1
      containers:
      - name: rabbitmq-master
        image: 172.17.0.1:5000/queue-node
        ports:
        - containerPort: 5672
          name: queue-rw-port
        - containerPort: 15672
          name: queue-mgt-port
        - containerPort: 5671
          name: queue-ssl
SLAVE:
apiVersion: v1
kind: Service
metadata:
  name: rabbitmqsvc-slave
  labels:
    app: queue-slave
spec:
  ports:
  - port: 5672
    name: queue-rw-port
  - port: 15672
    name: queue-mgt-port
    nodePort: 31607
  - port: 5671
    name: queue-ssl
    nodePort: 32719
  selector:
    app: queue-slave
  type: NodePort
  clusterIP: 10.16.0.122
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: rabbitmq-controller-slave
  labels:
    app: queue-slave
spec:
  replicas: 1
  selector:
    app: queue-slave
  template:
    metadata:
      name: rabbitmq-pod
      labels:
        app: queue-slave
    spec:
      nodeSelector:
        nodesize: small2
      containers:
      - name: rabbitmq-slave
        image: 172.17.0.1:5000/queue-node
        env:
        - name: CLUSTERED
          value: "true"
        - name: CLUSTER_WITH
          value: "rabbitmq-controller-2ll1s"
        - name: RAM_NODE
          value: "true"
        - name: HOST_NAME
          value: "rabbit2"
        ports:
        - containerPort: 5672
          name: queue-rw-port
        - containerPort: 15672
          name: queue-mgt-port
        - containerPort: 5671
          name: queue-ssl
You can't use Docker hostnames and links with Kubernetes. They only work on a single machine, whereas Kubernetes is designed to be a clustered solution.
Enable DNS in your cluster, create the master Service (rabbitmqsvc, which you already have), and then use the DNS name of that Service in your slave configuration:
- name: CLUSTER_WITH
  value: "rabbitmqsvc.default.svc.cluster.local"
The fully qualified service name has the form <service>.<namespace>.svc.cluster.local; the default namespace is assumed here. You could also use the Service's environment variables, but IMHO DNS is the better option.
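One caveat: the script above joins with rabbit@$CLUSTER_WITH, so the name the slave dials has to match the master's actual node name. A minimal sketch of how the two container specs might line up, assuming the default namespace (RABBITMQ_USE_LONGNAME and RABBITMQ_NODENAME are standard RabbitMQ environment variables, but using them this way is my suggestion, not part of the original setup):

# Master container (sketch): pin the node name to the Service DNS name
env:
- name: RABBITMQ_USE_LONGNAME
  value: "true"
- name: RABBITMQ_NODENAME
  value: "rabbit@rabbitmqsvc.default.svc.cluster.local"

# Slave container (sketch): dial that same long name
env:
- name: RABBITMQ_USE_LONGNAME
  value: "true"
- name: CLUSTERED
  value: "true"
- name: CLUSTER_WITH
  value: "rabbitmqsvc.default.svc.cluster.local"
- name: RAM_NODE
  value: "true"

With long names enabled on both nodes, join_cluster rabbit@rabbitmqsvc.default.svc.cluster.local resolves through the Service instead of a Docker hostname.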
We just open-sourced a deployment-ready RabbitMQ cluster for Kubernetes. It uses StatefulSets, so it requires Kubernetes 1.5.x or later.
You can find it here: https://github.com/nanit/kubernetes-rabbitmq-cluster
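StatefulSets address exactly the hostname problem in the question: backed by a headless Service, each pod gets a stable DNS name of the form <pod>.<service>.<namespace>.svc.cluster.local that RabbitMQ nodes can cluster on. A minimal sketch of the idea (resource names are illustrative, not taken from the linked repo):

apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
spec:
  clusterIP: None         # headless: gives each pod a stable DNS record
  selector:
    app: rabbitmq
  ports:
  - port: 5672
    name: amqp
---
apiVersion: apps/v1beta1  # the StatefulSet API group in Kubernetes 1.5/1.6
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq   # must reference the headless Service above
  replicas: 2
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:3.6   # any clustering-aware image/entrypoint
        ports:
        - containerPort: 5672
          name: amqp

The pods come up as rabbitmq-0.rabbitmq.default.svc.cluster.local, rabbitmq-1.rabbitmq.default.svc.cluster.local, and so on, so a join script like the one above can reliably target rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local.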