I would like to create keyspaces and column-families at the start of my Cassandra container.
I tried the following in a docker-compose.yml
file:
# shortened for clarity
cassandra:
hostname: my-cassandra
image: my/cassandra:latest
command: "cqlsh -f init-database.cql"
The image my/cassandra:latest
contains init-database.cql
in /
. But this does not seem to work.
Is there a way to make this happen ?
Go to data tab > there you will see all keyspcaces created by you and some system keyspaces. You can see all tables under individual keyspaces and also replicator factor for keyspace.
I was also searching for the solution to this question, and here is the way how I accomplished it.
Here the second instance of Cassandra has a volume with the schema.cql and runs CQLSH command
My Version with healthcheck so we can get rid of sleep command
version: '2.2'
services:
cassandra:
image: cassandra:3.11.2
container_name: cassandra
ports:
- "9042:9042"
environment:
- "MAX_HEAP_SIZE=256M"
- "HEAP_NEWSIZE=128M"
restart: always
volumes:
- ./out/cassandra_data:/var/lib/cassandra
healthcheck:
test: ["CMD", "cqlsh", "-u cassandra", "-p cassandra" ,"-e describe keyspaces"]
interval: 15s
timeout: 10s
retries: 10
cassandra-load-keyspace:
container_name: cassandra-load-keyspace
image: cassandra:3.11.2
depends_on:
cassandra:
condition: service_healthy
volumes:
- ./src/main/resources/cassandra_schema.cql:/schema.cql
command: /bin/bash -c "echo loading cassandra keyspace && cqlsh cassandra -f /schema.cql"
NetFlix Version using sleep
version: '3.5'
services:
cassandra:
image: cassandra:latest
container_name: cassandra
ports:
- "9042:9042"
environment:
- "MAX_HEAP_SIZE=256M"
- "HEAP_NEWSIZE=128M"
restart: always
volumes:
- ./out/cassandra_data:/var/lib/cassandra
cassandra-load-keyspace:
container_name: cassandra-load-keyspace
image: cassandra:latest
depends_on:
- cassandra
volumes:
- ./src/main/resources/cassandra_schema.cql:/schema.cql
command: /bin/bash -c "sleep 60 && echo loading cassandra keyspace && cqlsh cassandra -f /schema.cql"
P.S I found this way at one of the Netflix Repos
We recently tried to solve a similar problem in KillrVideo, a reference application for Cassandra. We are using Docker Compose to spin up the environment needed by the application which includes a DataStax Enterprise (i.e. Cassandra) node. We wanted that node to do some bootstrapping the first time it was started to install the CQL schema (using cqlsh
to run the statements in a .cql
file just like you're trying to do). Basically the approach we took was to write a shell script for our Docker entrypoint that:
cqlsh -f
to run some CQL statements and init the schema.We just use the existence of a file to indicate whether the node has already been bootstrapped and check that on startup to determine whether we need to do that logic above or can just start it normally. You can see the results in the killrvideo-dse-docker repository on GitHub.
There is one caveat to this approach. This worked great for us because in our reference application, we're only spinning up a single node (i.e. we aren't creating a cluster with more than one node). If you're running multiple nodes, you'll probably want to make sure that only one of the nodes does the bootstrapping to create the schema because multiple clients modifying the schema simultaneously can cause some issues with your cluster. (This is a known issue and will hopefully be fixed at some point.)
I solved this problem by patching cassandra's docker-entrypoint.sh
so it will execute sh
and cql
files located in /docker-entrypoint-initdb.d
on startup. This is similar to how MySQL docker containers work.
Basically, I add a small script at the end of the docker-entrypoint.sh
(right before the last line, exec "$@"
), that will run the cql scripts once cassandra is up. A simplified version is:
INIT_DIR=docker-entrypoint-initdb.d
# this whole block will execute in the background
(
cd $INIT_DIR
# wait for cassandra to be ready
while ! cqlsh -e 'describe cluster' > /dev/null 2>&1; do sleep 6; done
echo "$0: Cassandra cluster ready: executing cql scripts found in $INIT_DIR"
# find and execute cql scripts, in name order
for f in $(find . -type f -name "*.cql" -print | sort); do
echo "$0: running $f"
cqlsh -f "$f"
echo "$0: $f executed"
done
) &
This solution works for all cassandra versions (at least until 3.11, as the time of writing).
Hence, you only have to build and use this cassandra image version, and then add proper initializations scripts to the container using docker-compose volumes.
A complete gist with a more robust entrypoint patch (and example) is available here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With