How can we use Kafka Connect with Cassandra without using the Confluent frameworks?
The DataMountaineer Stream Reactor has a Cassandra Source and Sink solution that can be used with Kafka Connect.
Download the connector jar file, drop it into the Kafka libs folder, and then specify your connector as follows:
{
  "name": "cassandra-NAME",
  "config": {
    "tasks.max": "1",
    "connector.class": "com.datamountaineer.streamreactor.connect.cassandra.source.CassandraSourceConnector",
    "connect.cassandra.key.space": "KEYSPACE",
    "connect.cassandra.source.kcql": "INSERT INTO KAFKA_TOPIC SELECT column1, timestamp_col FROM CASSANDRA_TABLE PK timestamp_col",
    "connect.cassandra.import.mode": "incremental",
    "connect.cassandra.contact.points": "localhost",
    "connect.cassandra.port": 9042,
    "connect.cassandra.import.poll.interval": 10000
  }
}
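If you need several of these definitions (one per table, say), it may be easier to generate the JSON than to edit it by hand. The following is a minimal Python sketch, not part of Stream Reactor: the function name build_source_definition and its parameters are made up for illustration, and the property keys and KCQL shape are simply copied from the definition above.

```python
import json


def build_source_definition(name, keyspace, topic, table, columns, pk_column,
                            contact_points="localhost", port=9042,
                            poll_interval_ms=10000):
    """Return a connector definition shaped like the JSON above.

    Hypothetical helper: it only assembles a dict, it does not talk to
    Kafka Connect or Cassandra.
    """
    # KCQL statement in the same form as the example definition.
    kcql = "INSERT INTO {} SELECT {} FROM {} PK {}".format(
        topic, ", ".join(columns), table, pk_column
    )
    return {
        "name": name,
        "config": {
            "tasks.max": "1",
            "connector.class": "com.datamountaineer.streamreactor.connect."
                               "cassandra.source.CassandraSourceConnector",
            "connect.cassandra.key.space": keyspace,
            "connect.cassandra.source.kcql": kcql,
            "connect.cassandra.import.mode": "incremental",
            "connect.cassandra.contact.points": contact_points,
            "connect.cassandra.port": port,
            "connect.cassandra.import.poll.interval": poll_interval_ms,
        },
    }


definition = build_source_definition(
    "cassandra-NAME", "KEYSPACE", "KAFKA_TOPIC", "CASSANDRA_TABLE",
    ["column1", "timestamp_col"], "timestamp_col",
)
# Serialize for saving to a file such as config/connect-cassandra-source.json.
definition_json = json.dumps(definition, indent=2)
```

Writing definition_json to a file gives you the same properties file as the hand-written one, with the placeholders filled in consistently.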
Start Kafka Connect:
bin/connect-distributed.sh config/connect-distributed.properties
Then load the Cassandra connector into Kafka Connect via the JSON properties file noted above (assuming it is named connect-cassandra-source.json):
curl -X POST -H "Content-Type: application/json" -d @config/connect-cassandra-source.json localhost:8083/connectors
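The same POST can be scripted, which also makes it easy to check afterwards whether the connector actually started. Below is a rough Python sketch using only the standard library. It assumes the Kafka Connect REST interface is reachable at localhost:8083 as in the curl command, and it uses the REST API's /connectors and /connectors/{name}/status endpoints. The helper names are invented for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same host and port as the curl command above.
BASE_URL = "http://localhost:8083"


def create_connector_request(definition, base_url=BASE_URL):
    """Build the POST /connectors request (equivalent to the curl call)."""
    return urllib.request.Request(
        base_url + "/connectors",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def status_request(name, base_url=BASE_URL):
    """Build the GET /connectors/{name}/status request."""
    quoted = urllib.parse.quote(name, safe="")
    return urllib.request.Request(
        base_url + "/connectors/" + quoted + "/status", method="GET"
    )


def send(request):
    """Send a request to a running Connect worker and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a worker running, you would load the JSON file into a dict, call send(create_connector_request(definition)), and then send(status_request("cassandra-NAME")) to see the state reported for the connector and its tasks.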
You will need to create a table that has a timeuuid column as a clustering key. That is described here.
Kafka Connect is the framework. Confluent only offers connectors. If you don't want to use Confluent Open Source (but why wouldn't you?), you can use all those connectors with vanilla Apache Kafka, too.
There are multiple Cassandra connectors available: https://www.confluent.io/product/connectors/
Btw: none of the listed Cassandra connectors is maintained by Confluent.
Of course, you could also write your own connector or use any other third-party connector.