What is the best practice for moving data from a Kafka cluster to a Redshift table? We have continuous data arriving on Kafka and I want to write it to tables in Redshift (it doesn't have to be in real time).
Kafka Connect is commonly used for streaming data from Kafka to (and from) data stores. It does useful things like automagically managing scale-out, failover, schemas, serialisation, and so on.
This blog shows how to use the open-source JDBC Kafka Connect connector to stream to Redshift. There is also a community Redshift connector, but I've not tried it.
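To give a rough idea of what the JDBC sink route looks like, here is a minimal sketch in Python that builds a connector config and prepares a request for the Kafka Connect REST API. It is not taken from the blog. The connector name, topics, host, database, and credentials are all placeholders, and it assumes a Connect worker listening on `localhost:8083` with the JDBC connector and the Redshift JDBC driver installed. Whether `auto.create` and the chosen `insert.mode` behave well against Redshift is something you would need to verify for your setup.

```python
import json
import urllib.request


def build_redshift_sink_config(topics, host, database, user, password, port=5439):
    """Return a JDBC sink connector config pointing at a Redshift database.

    All argument values are supplied by the caller; nothing here is
    specific to a real cluster.
    """
    return {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "tasks.max": "1",
        # Kafka Connect expects a comma-separated list of topic names.
        "topics": ",".join(topics),
        "connection.url": f"jdbc:redshift://{host}:{port}/{database}",
        "connection.user": user,
        "connection.password": password,
        # Plain inserts; assumption: append-only loading is acceptable.
        "insert.mode": "insert",
        # Assumption: let the connector create missing tables from the schema.
        "auto.create": "true",
    }


def build_put_request(connect_url, name, config):
    """Build (but do not send) a PUT /connectors/<name>/config request.

    PUT creates the connector if it does not exist and updates it otherwise.
    """
    url = f"{connect_url.rstrip('/')}/connectors/{name}/config"
    return urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def submit(request):
    """Send the request to the Connect worker and return the HTTP status."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a worker running, you would call something like `submit(build_put_request("http://localhost:8083", "redshift-sink", config))`, where `redshift-sink` is a hypothetical connector name. Since the data doesn't need to be real time, the connector's batching settings are worth tuning afterwards.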
This blog shows another approach, not using Kafka Connect.
Disclaimer: I work for Confluent, who created the JDBC connector.