I'm trying to deploy nsqlookupd using fleet on a brand-new CoreOS cluster in EC2. Here is my systemd unit file:
[Unit]
Description=nsqlookupd service
After=docker.service
Requires=docker.service
[Service]
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/docker kill nsqlookupd
ExecStartPre=-/usr/bin/docker rm nsqlookupd
ExecStart=/usr/bin/docker run -d --name=nsqlookupd -e BROADCAST_ADDRESS=$COREOS_PUBLIC_IPV4 -p 4160:4160 -p 4161:4161 mikedewar/nsqlookupd
ExecStartPost=/usr/bin/etcdctl set /nsqlookupd_broadcast_address $COREOS_PUBLIC_IPV4
ExecStop=/usr/bin/docker stop -t 1 nsqlookupd
ExecStopPost=/usr/bin/etcdctl rm /nsqlookupd_broadcast_address
I've verified the container works fine if I just run the ExecStart command. My docker logs just look like
~ $ docker logs nsqlookupd
2014/08/08 02:23:58 nsqlookupd v0.2.29-alpha (built w/go1.2.2)
2014/08/08 02:23:58 TCP: listening on [::]:4160
2014/08/08 02:23:58 HTTP: listening on [::]:4161
and my fleetctl journal looks like
$ fleetctl journal nsqlookupd.service
-- Logs begin at Sun 2014-08-03 12:49:00 UTC, end at Fri 2014-08-08 02:30:06 UTC. --
Aug 08 02:23:57 ip-10-147-9-249 systemd[1]: Starting nsqlookupd service...
Aug 08 02:23:57 ip-10-147-9-249 docker[6140]: Error response from daemon: No such container: nsqlookupd
Aug 08 02:23:57 ip-10-147-9-249 docker[6140]: 2014/08/08 02:23:57 Error: failed to kill one or more containers
Aug 08 02:23:57 ip-10-147-9-249 docker[6148]: Error response from daemon: No such container: nsqlookupd
Aug 08 02:23:57 ip-10-147-9-249 docker[6148]: 2014/08/08 02:23:57 Error: failed to remove one or more containers
Aug 08 02:23:57 ip-10-147-9-249 etcdctl[6157]: 54.198.93.169
Aug 08 02:23:57 ip-10-147-9-249 systemd[1]: Started nsqlookupd service.
Aug 08 02:23:57 ip-10-147-9-249 docker[6155]: 0fce4465f61c092541ba9d4c4e89ce13c4d6bedc096519034ed585d7adb5e0d7
Aug 08 02:23:59 ip-10-147-9-249 docker[6194]: nsqlookupd
both of which look just fine. But the container dies quietly, and my fleetctl list-units gives
$ fleetctl list-units
UNIT                STATE     LOAD    ACTIVE        SUB   DESC                MACHINE
nsqlookupd.service  launched  loaded  deactivating  stop  nsqlookupd service  1320802c.../10.147.9.249
Running docker images is a little worrying:
$ docker images
REPOSITORY            TAG     IMAGE ID      CREATED        VIRTUAL SIZE
<none>                <none>  8ef9d8f9d18d  9 minutes ago  710 MB
mikedewar/nsqadmin    latest  432af572bda8  2 days ago     710 MB
mikedewar/nsqd        latest  00bd4e474964  2 days ago     710 MB
<none>                <none>  adf0ed97208e  3 weeks ago    710 MB
mikedewar/nsqlookupd  latest  2219c0e783d9  3 weeks ago    710 MB
<none>                <none>  35d2212f8932  3 weeks ago    710 MB
mikedewar/nsq         latest  f9794fe056e1  3 weeks ago    710 MB
busybox               latest  a9eb17255234  9 weeks ago    2.433 MB
zmarcantel/cassandra  latest  b1168b45b4f8  4 months ago   738 MB
as I've been updating mikedewar/nsqlookupd quite regularly over the last 3 weeks. Maybe that's the time I first pushed something to Docker Hub? I'd love to know that the image I'm working with is the up-to-date one. I've tried docker rmi mikedewar/nsqlookupd followed by docker pull mikedewar/nsqlookupd, but the CREATED column still says it was created 3 weeks ago.
I don't know if this is useful, but the ExecStopPost=/usr/bin/etcdctl rm /nsqlookupd_broadcast_address command seems to have worked: the etcdctl log line in the fleet journal suggests I managed to set the key to my IP, but after the container dies I can't get that key from etcd.
Any help on where to look next for clues, or any ideas why this is happening would be greatly appreciated! As is probably clear I'm rather new to this sort of thing...
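You shouldn't run Docker containers in detached mode in a unit file. Your ExecStart contains the -d flag: ExecStart=/usr/bin/docker run -d. With -d, the docker client starts the container in the background and exits immediately, so systemd thinks the service's main process has exited. It then runs your ExecStop and ExecStopPost commands, which is why the container gets stopped and the etcd key disappears.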
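As a minimal sketch of that fix, here is the unit file from the question with only the -d flag removed, so that docker run stays in the foreground and systemd can track it as the service's main process. Everything else is copied from the question unchanged; I haven't tested it against this particular image, so treat it as a starting point rather than a verified configuration.

```ini
[Unit]
Description=nsqlookupd service
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=/etc/environment
# Clean up any leftover container; the leading "-" ignores failures
ExecStartPre=-/usr/bin/docker kill nsqlookupd
ExecStartPre=-/usr/bin/docker rm nsqlookupd
# No -d here: docker run stays attached, so systemd sees a long-running process
ExecStart=/usr/bin/docker run --name=nsqlookupd -e BROADCAST_ADDRESS=$COREOS_PUBLIC_IPV4 -p 4160:4160 -p 4161:4161 mikedewar/nsqlookupd
ExecStartPost=/usr/bin/etcdctl set /nsqlookupd_broadcast_address $COREOS_PUBLIC_IPV4
ExecStop=/usr/bin/docker stop -t 1 nsqlookupd
ExecStopPost=/usr/bin/etcdctl rm /nsqlookupd_broadcast_address
```

With the container attached, its output should also show up in fleetctl journal nsqlookupd.service, which makes it easier to see why it exits if it still does.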
As for managing versions, if you want to be absolutely sure you're getting the latest copy, tag your images with an explicit version and pull that exact tag, for example mikedewar/nsqlookupd:1.2.3. You can increment the tag each time in your fleet unit file.
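A rough sketch of what pinning the tag could look like in the [Service] section, assuming you have already pushed an image tagged 1.2.3 (the version number is only the placeholder from above, not a real release of this image):

```ini
[Service]
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/docker kill nsqlookupd
ExecStartPre=-/usr/bin/docker rm nsqlookupd
# Pull the pinned tag explicitly so the host never reuses a stale "latest"
# (1.2.3 is a placeholder; bump it on every release)
ExecStartPre=/usr/bin/docker pull mikedewar/nsqlookupd:1.2.3
# Run the same pinned tag, in the foreground (no -d)
ExecStart=/usr/bin/docker run --name=nsqlookupd -e BROADCAST_ADDRESS=$COREOS_PUBLIC_IPV4 -p 4160:4160 -p 4161:4161 mikedewar/nsqlookupd:1.2.3
```

Because the pull step has no leading "-", a failed pull fails the unit start instead of silently falling back to whatever image is cached on the machine.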