
The "--cluster-store" and "--cluster-advertise" don't work



I am trying to set up a Docker cluster with Swarm and Consul. I have three machines: manager, host1, and host2.
I run the Consul and Swarm manager containers on the manager.

$ docker run --rm -p 8500:8500 progrium/consul -server -bootstrap
$ docker run -d -p 2377:2375 swarm manage consul://<manager>:8500

On host1 and host2, I modified the daemon options to add --cluster-store and --cluster-advertise, then restarted the Docker daemon.

DOCKER_OPTS="--cluster-store=consul://<manager>:8500 --cluster-advertise=<host1>:2375"
DOCKER_OPTS="--cluster-store=consul://<manager>:8500 --cluster-advertise=<host2>:2375"

When I join host1 and host2 to the swarm, it fails.

host1 $ docker run --rm swarm join --advertise=<host1>:2375 consul://<manager>:8500
host2 $ docker run --rm swarm join --advertise=<host2>:2375 consul://<manager>:8500

The swarm manager log shows these errors:

time="2016-01-20T02:17:17Z" level=error msg="Get http://<host1>:2375/v1.15/info: dial tcp <host1>:2375: getsockopt: connection refused"
time="2016-01-20T02:17:20Z" level=error msg="Get http://<host2>:2375/v1.15/info: dial tcp <host2>:2375: getsockopt: connection refused"
firelyu asked Jan 20 '16 05:01


1 Answer

I ran into a similar problem and eventually found out why it didn't work. In my setup I use multiple boxes on a LAN that I want to manage from inside that network, allowing access from the outside only to certain containers. The following examples are run on the box at

  • Set up the Daemons with --cluster-store consul:// on port 8500 (deploying Consul and registrator on each Daemon as the first containers) and --cluster-advertise, as well as -H tcp:// -H unix:///var/run/docker.sock -H tcp:// (I do not, however, bind to the other available addresses as you would with tcp:// and instead only bind to the local address). If you want containers to bind only to the local network as well (as I did in this case), you can specify the additional --ip parameter for the Daemon. When a container should also be reachable from everywhere else (in my case only an nginx load balancer with failover via keepalived), bind its port to all interfaces with docker run ... -p ... <image>
  • Start the Daemons
  • Deploy gliderlabs/registrator and Consul with compose (this is an example from the first box in my setup, but I start the equivalent on all Daemons for a complete Consul HA failover setup) using docker-compose -p bootstrap up -d, which names the containers bootstrap_registrator_1 and bootstrap_consul_1 in the private network bootstrap:

    version: '2'
    services:
      registrator:
        image: gliderlabs/registrator
        command: consul://
        depends_on:   # key lost in the original; depends_on is an assumption
          - consul
        volumes:
          - /var/run/docker.sock:/tmp/docker.sock
        restart: unless-stopped
      consul:
        image: consul
        command: agent -server -bootstrap -ui -advertise -client
        hostname: srv-0
        network_mode: host
        ports:
          - "8300:8300"     # Server RPC, Server Use Only
          - "8301:8301/tcp" # Serf Gossip Protocol for LAN
          - "8301:8301/udp" # Serf Gossip Protocol for LAN
          - "8302:8302/tcp" # Serf Gossip Protocol for WAN, Server Use Only
          - "8302:8302/udp" # Serf Gossip Protocol for WAN, Server Use Only
          - "8400:8400"     # CLI RPC
          - "8500:8500"     # HTTP API & Web UI
          - "53:8600/tcp"   # DNS Interface
          - "53:8600/udp"   # DNS Interface
        restart: unless-stopped
  • Now the Daemons register and set locks on the KV store (Consul) under docker/nodes, but Swarm does not seem to read from this location automatically, so when it looks up which Daemons are available it doesn't find any. This bit cost me the most time. To solve it I had to specify --discovery-opt kv.path=docker/nodes and start Swarm with docker-compose -p bootstrap up -d, again on all boxes, to end up with a Swarm HA failover of managers:

    version: '2'
    services:
      swarm:   # service name lost in the original; "swarm" is an assumption
        image: swarm
        command: manage -H :3375 --replication --advertise --discovery-opt kv.path=docker/nodes consul://
        hostname: srv-0
        ports:
          - "" #
        restart: unless-stopped
  • Now I end up with a working Swarm that is only available on the network on port 3375. All containers that are started are likewise only available to this network unless I specify -p when starting them (with docker run).

  • Further scaling: When I add more boxes to the local network to grow capacity, my idea is to add more Daemons, and maybe non-manager Swarm instances with those, as well as later Consul clients (rather than servers, started with -server).
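
The daemon options from the first step can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name build_docker_opts and the two example addresses are hypothetical, and port 2375 is taken from the question rather than from my setup. The point it shows is that --cluster-advertise only announces an address to the cluster, so the Daemon also needs a matching -H tcp:// listener, otherwise the Swarm manager gets "connection refused" as in the question.

```shell
#!/bin/sh
# Sketch only: prints a DOCKER_OPTS line for one Daemon.
# build_docker_opts and both addresses below are hypothetical placeholders.
build_docker_opts() {
    consul_addr="$1"   # address where the Consul KV store is reachable
    host_addr="$2"     # LAN address this Daemon should listen on and advertise

    # -H unix://...        keeps the local socket for the docker CLI
    # -H tcp://...         makes the Daemon actually listen on the LAN address
    # --cluster-store      points the Daemon at Consul (port 8500)
    # --cluster-advertise  announces the same address and port to the cluster
    printf 'DOCKER_OPTS="%s"\n' \
        "-H unix:///var/run/docker.sock -H tcp://${host_addr}:2375 --cluster-store=consul://${consul_addr}:8500 --cluster-advertise=${host_addr}:2375"
}

# Example call with documentation-range addresses (assumptions, not my real network).
build_docker_opts 192.0.2.10 192.0.2.11
```

The printed line would go into the Daemon's options file (for example /etc/default/docker on Ubuntu with upstart), followed by a Daemon restart. An unauthenticated TCP listener should only be bound to a trusted network address.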
Jan answered Sep 25 '22 02:09
