
How to copy data in bulk from Kinesis -> Redshift

When I read about AWS Data Pipeline, an idea immediately struck me: send statistics to Kinesis and create a pipeline job that consumes the data from Kinesis and COPYs it to Redshift every hour. All in one go.

But it seems there is no node in Data Pipeline that can consume from Kinesis. So now I have two possible plans of action:

  1. Create an instance that consumes the Kinesis data and writes it to S3, split by hour. The pipeline will copy from there to Redshift.
  2. Consume from Kinesis and issue a COPY directly to Redshift on the spot.

What should I do? Is there no way to connect Kinesis to Redshift using AWS services only, without custom code?
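
To make plan 1 concrete, here is a rough sketch of the two pieces it needs: an S3 key prefix per hour, and the COPY statement that loads one hour's files. The `stats` base prefix, the table name, the bucket name, the IAM role and the gzipped-JSON file format are all assumptions for illustration, not something Data Pipeline or Kinesis dictates.

```python
from datetime import datetime, timezone


def hourly_prefix(ts: datetime, base: str = "stats") -> str:
    """Return the S3 key prefix for the hour containing `ts`, in UTC.

    The "base/YYYY/MM/DD/HH/" layout is an assumed convention.
    """
    ts = ts.astimezone(timezone.utc)
    return f"{base}/{ts:%Y/%m/%d/%H}/"


def copy_statement(table: str, bucket: str, prefix: str, iam_role: str) -> str:
    """Build a Redshift COPY statement that loads every object under `prefix`.

    Assumes the consumer wrote gzipped JSON records; adjust the format
    options to match whatever the consumer actually writes.
    """
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS JSON 'auto' GZIP;"
    )
```

The consumer would write each batch under `hourly_prefix(now)`, and the hourly job would run `copy_statement(...)` for the hour that has just closed, so each file is loaded exactly once.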

asked Nov 21 '14 16:11 by FXGlory


2 Answers

It is now possible to do this without custom code via a new managed service called Kinesis Firehose. It manages the desired buffering intervals, temporary uploads to S3, the load into Redshift, error handling and automatic throughput management.
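
As a rough sketch of what that setup could look like from boto3: the helper below only builds the arguments for the Firehose `create_delivery_stream` call, with an existing Kinesis stream as the source and Redshift as the destination. Every name, ARN, the 300-second / 5 MB buffering values and the gzipped-JSON copy options are placeholder assumptions, and the password handling is simplified for illustration.

```python
def firehose_redshift_kwargs(stream_name, source_stream_arn, role_arn,
                             jdbc_url, table, bucket_arn, username, password,
                             interval_seconds=300, size_mb=5):
    """Build keyword arguments for boto3's firehose create_delivery_stream.

    All values passed in are placeholders supplied by the caller.
    """
    return {
        "DeliveryStreamName": stream_name,
        # Read from an existing Kinesis stream instead of direct PUTs.
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": source_stream_arn,
            "RoleARN": role_arn,
        },
        "RedshiftDestinationConfiguration": {
            "RoleARN": role_arn,
            "ClusterJDBCURL": jdbc_url,
            "CopyCommand": {
                "DataTableName": table,
                # Assumed format: gzipped JSON, matching CompressionFormat below.
                "CopyOptions": "JSON 'auto' GZIP",
            },
            "Username": username,
            "Password": password,
            # Firehose stages data in S3 and then issues the COPY itself.
            "S3Configuration": {
                "RoleARN": role_arn,
                "BucketARN": bucket_arn,
                "BufferingHints": {
                    "IntervalInSeconds": interval_seconds,
                    "SizeInMBs": size_mb,
                },
                "CompressionFormat": "GZIP",
            },
        },
    }


def create_delivery_stream(**settings):
    """Create the delivery stream; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("firehose")
    return client.create_delivery_stream(**firehose_redshift_kwargs(**settings))
```

Keeping the argument builder separate from the API call makes the configuration easy to inspect or unit-test before anything is created in the account.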

answered Sep 20 '22 18:09 by Froyke


That is already done for you! If you use the Kinesis Connector Library, there is a built-in connector to Redshift:

https://github.com/awslabs/amazon-kinesis-connectors

Depending on the processing logic you need, the connector can be really easy to implement.

answered Sep 18 '22 18:09 by Alexandre Rondeau