I am trying to understand how to deploy an Amazon Kinesis client application that was built using the Kinesis Client Library (KCL).
I found this, but it only states:
You can follow your own best practices for deploying code to an Amazon EC2 instance when you deploy an Amazon Kinesis application. For example, you can add your Amazon Kinesis application to one of your Amazon EC2 AMIs.
which does not give me the broader picture.
These examples use an Ant script to run the Java program. Is this the best practice to follow?
Also, I understand even before running the EC2 instances I need to make sure
Could someone please add some more detail on this?
With the Amazon Kinesis Client Library (KCL), you can build Amazon Kinesis applications and use streaming data to power real-time dashboards, generate alerts, implement dynamic pricing and advertising, and more.
To put data into the stream, you must specify the name of the stream, a partition key, and the data blob to be added to the stream. The partition key is used to determine which shard in the stream the data record is added to. All the data in the shard is sent to the same worker that is processing the shard.
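As a rough illustration of the paragraph above, here is a minimal Python sketch of the two ideas: the three things a put needs (stream name, partition key, data blob) and how a partition key ends up on one shard. Kinesis hashes the partition key with MD5 into a 128-bit integer and picks the shard whose hash key range contains it. The helper names (`hash_key_for`, `even_shard_ranges`, `shard_index_for`, `put_event`) are made up for this example, the evenly split ranges are an assumption (real streams can have uneven ranges after splits and merges), and the big-endian reading of the digest is my understanding rather than something stated above. `put_record` with `StreamName`, `Data`, and `PartitionKey` is the boto3 Kinesis client call; the client is passed in so the sketch itself only needs the standard library.

```python
import hashlib
import json

MAX_HASH_KEY = 2**128 - 1  # hash keys are 128-bit unsigned integers


def hash_key_for(partition_key):
    """Map a partition key to a 128-bit integer via MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, byteorder="big")


def even_shard_ranges(shard_count):
    """Split the hash key space into contiguous ranges of (almost) equal size.

    Assumption for illustration only: a freshly created stream with
    evenly divided shards.
    """
    size = (MAX_HASH_KEY + 1) // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = MAX_HASH_KEY if i == shard_count - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_index_for(partition_key, ranges):
    """Return the index of the range that contains the key's hash."""
    key = hash_key_for(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= key <= end:
            return index
    raise ValueError("hash key is outside every shard range")


def put_event(client, stream_name, partition_key, payload):
    """Send one record: stream name, partition key, and data blob.

    `client` is expected to behave like boto3.client("kinesis").
    """
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload).encode("utf-8"),
        PartitionKey=partition_key,
    )


if __name__ == "__main__":
    ranges = even_shard_ranges(4)
    for key in ("user-1", "user-2", "user-1"):
        # The same partition key always lands in the same range.
        print(key, "-> shard index", shard_index_for(key, ranges))
```

With boto3 installed and credentials configured, something like `put_event(boto3.client("kinesis"), "my-stream", "user-1", {"clicks": 3})` would send a real record; `my-stream` is a placeholder name.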
Amazon Kinesis is responsible for ingesting data, not for running your application. You can run your application anywhere, but it is a good idea to run it on EC2, as you are probably going to use other AWS services, such as S3 or DynamoDB (the Kinesis Client Library uses a DynamoDB table to track shard leases and checkpoints, for example).
To understand Kinesis better, I'd recommend that you launch the Kinesis Data Visualization Sample. When you launch this app, use the provided CloudFormation template. It will create a stack with the Kinesis stream and an EC2 instance running the application, which uses the Kinesis Client Library and is a fully working example to start from.
The best way I have found to host a consumer program is using EMR, but not as a Hadoop cluster. Package your program as a jar and place it in S3. Launch an EMR cluster and have it run your jar. Using AWS Data Pipeline, you can schedule this job flow to run at regular intervals. You can also scale an EMR cluster, or use an actual EMR job to process the stream if you want something more sophisticated.
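To make the "jar in S3, run it on EMR" idea a bit more concrete, here is a minimal Python sketch that builds the request for a cluster with a single custom-jar step. Treat it as one possible shape, not the setup the answer above used. The `Steps`, `HadoopJarStep`, `Instances`, `JobFlowRole`, and `ServiceRole` fields follow the boto3 EMR `run_job_flow` request, but the function names, the cluster name, the instance type, the release label, and the use of the default EMR roles are all assumptions you would need to adapt.

```python
def build_jar_step(jar_s3_uri, args=(), name="run-kinesis-consumer"):
    """Describe one EMR step that runs a jar stored in S3."""
    if not jar_s3_uri.startswith("s3://"):
        raise ValueError("expected an s3:// URI for the jar")
    return {
        "Name": name,
        # Shut the cluster down if the consumer exits with an error.
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {"Jar": jar_s3_uri, "Args": list(args)},
    }


def build_cluster_request(jar_s3_uri, args=(), instance_type="m5.xlarge",
                          instance_count=1, release_label="emr-6.15.0"):
    """Build keyword arguments for run_job_flow (all defaults are assumptions)."""
    return {
        "Name": "kinesis-consumer",  # hypothetical cluster name
        "ReleaseLabel": release_label,
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # Terminate the cluster once the jar step has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [build_jar_step(jar_s3_uri, args)],
        # Assumes the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(emr_client, request):
    """Start the cluster; `emr_client` should behave like boto3.client("emr")."""
    response = emr_client.run_job_flow(**request)
    return response["JobFlowId"]


if __name__ == "__main__":
    # Placeholder bucket, jar, and argument; prints the request without calling AWS.
    request = build_cluster_request("s3://my-bucket/consumer.jar", ["my-stream"])
    print(request)
```

Passing the client in keeps the request-building part testable without AWS access. With boto3 installed, `launch_cluster(boto3.client("emr"), request)` would start the cluster and return its job flow ID.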
You can also use Elastic Beanstalk. I believe this article is highly useful.