
How to create a Dockerfile for cassandra (or any database) that includes a schema?

I would like to create a Dockerfile that builds a Cassandra image with a keyspace and schema already in place when a container is started from it.

In general, how do you create a Dockerfile that will build an image that includes some step(s) that can't really be done until the container is running, at least the first time?

Right now, I have two steps: build the Cassandra image from an existing Cassandra Dockerfile and run it with a volume that maps the CQL schema files into a temporary directory, then run docker exec with cqlsh to import the schema once the container is up.

But that doesn't create an image with the schema - just a container. That container could be saved as an image, but that's cumbersome.

    docker run --name $CASSANDRA_NAME -d \
        -h $CASSANDRA_NAME \
        -v $CASSANDRA_DATA_DIR:/data \
        -v $CASSANDRA_DIR/target:/tmp/schema \
        tobert/cassandra:2.1.7

then

    docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/create_keyspace.cql
    docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/schema01.cql
    # etc

This works, but it makes the setup impossible to use with tools like Docker Compose, since linked containers/services will start up too and expect the schema to be in place.

I saw one attempt where the Cassandra process was started in the background in the Dockerfile during the build and cqlsh was run afterwards, but I don't think that worked too well.

adapt-dev asked Jan 20 '16 20:01



2 Answers

I had this issue, and someone suggested the following strategy:

  1. Start from an existing Cassandra Dockerfile, the official one for example.
  2. Remove the ENTRYPOINT instruction.
  3. Copy the schema (.cql) file and the data (.csv) files into the image, under /opt/data for example.
  4. Create a shell script that will be used as the last command to start Cassandra. The script should:

    a. Start Cassandra with $CASSANDRA_HOME/bin/cassandra.

    b. If a $CASSANDRA_HOME/data/data/your_keyspace-xxxx folder exists and is not empty, do nothing more.

    c. Otherwise:

       1. Sleep for some time to give the server a chance to start listening on port 9042.
       2. Once port 9042 is listening, execute the .cql script to load the .csv files.
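The startup script from step 4 could look roughly like the helpers below. This is only a sketch under assumptions: the function names (dir_not_empty, retry, load_schema_once) and the CQLSH, SCHEMA_ATTEMPTS and SCHEMA_DELAY variables are made up for illustration. Instead of sleeping and probing port 9042 directly, it simply retries cqlsh, which cannot connect until the port is open.

```shell
#!/bin/sh
# Hypothetical helpers for a first-run schema load; all names are illustrative.

# Succeeds when directory $1 exists and contains at least one entry.
dir_not_empty() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Retries a command until it succeeds or the attempts run out.
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Loads the schema only when the keyspace data directory is missing or empty.
# $1 = keyspace data directory, $2 = .cql file
load_schema_once() {
    if dir_not_empty "$1"; then
        echo "schema already present, skipping"
        return 0
    fi
    # cqlsh fails until Cassandra is listening, so retrying doubles as the wait.
    retry "${SCHEMA_ATTEMPTS:-30}" "${SCHEMA_DELAY:-2}" "${CQLSH:-cqlsh}" -f "$2"
}
```

A wrapper script would start Cassandra in the background, call load_schema_once with the keyspace folder from step b and the .cql file copied in step 3, and then wait on the server process so the container stays alive.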
    

I find this procedure rather cumbersome, but there seems to be no way around it. For a Cassandra hands-on lab, I found it easier to create a VM image using Vagrant and Ansible.

doanduyhai answered Oct 13 '22 15:10


Create a Dockerfile named Dockerfile_CAS:


    FROM cassandra:latest

    # Schema files to apply on startup
    COPY ddl.cql /docker-entrypoint-initdb.d/

    # Modified entrypoint script that loads the schema files
    COPY docker-entrypoint.sh /docker-entrypoint.sh
    RUN chmod +x /docker-entrypoint.sh

    ENTRYPOINT ["/docker-entrypoint.sh"]
    CMD ["cassandra", "-f"]


Edit docker-entrypoint.sh and add the following loop above the final exec "$@" line:

    for f in /docker-entrypoint-initdb.d/*; do
        case "$f" in
            *.sh)
                echo "$0: running $f"
                . "$f"
                ;;
            *.cql)
                # Retry in the background until Cassandra accepts connections
                echo "$0: running $f" &&
                    until cqlsh -f "$f"; do
                        >&2 echo "Cassandra is unavailable - sleeping"
                        sleep 2
                    done &
                ;;
            *) echo "$0: ignoring $f" ;;
        esac
        echo
    done
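Note that the loop starts one background retry job per .cql file, so with several files the order in which they get applied is not guaranteed. If ordering matters (keyspace before tables, for example), one possible variant is to apply the files one at a time inside a single background job. The following is a sketch, not part of the official image: apply_cql_dir, CQLSH and RETRY_DELAY are hypothetical names.

```shell
#!/bin/sh
# Hypothetical variant of the init loop: applies .cql files one at a time,
# in glob (alphabetical) order, retrying each until cqlsh accepts it.
# CQLSH and RETRY_DELAY are illustrative overrides, not official settings.
apply_cql_dir() {
    for f in "$1"/*.cql; do
        [ -e "$f" ] || continue   # the glob matched nothing
        echo "applying $f"
        until "${CQLSH:-cqlsh}" -f "$f"; do
            echo "Cassandra is unavailable - sleeping" >&2
            sleep "${RETRY_DELAY:-2}"
        done
    done
}

# In the entrypoint this would run in the background, above exec "$@":
#   apply_cql_dir /docker-entrypoint-initdb.d &
```

Naming the files with numeric prefixes (01_keyspace.cql, 02_tables.cql, and so on) would then control the order.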


Then rebuild the image:

    docker build -t suraj1287/cassandra -f Dockerfile_CAS .

suraj1287 answered Oct 13 '22 15:10