Kafka like offset on Kinesis Stream?

Tags: amazon-web-services, amazon-kinesis

I have worked a bit with Kafka in the past, and lately there is a requirement to port part of our data pipeline to an AWS Kinesis stream. I have read that Kinesis is effectively a fork of Kafka and that the two share many similarities.

However, I fail to see how multiple consumers can read from the same stream, each with its own offset. Each data record is given a sequence number, but I couldn't find anything specific to a consumer (like Kafka's group id?).

Is it really possible to have different consumers with different ingestion rates over the same AWS Kinesis stream?

asked Mar 16 '17 04:03

Mangat Rai Modi

1 Answer

Yes.

You can have multiple Kinesis Consumer Applications. Let's say you have 2.

The first consumer application (I think this is a "consumer group" in Kafka?) can be "first-app" and store its positions in the DynamoDB table "first-app-table". It can have as many nodes (EC2 instances) as you want.
The second consumer application can also work on the same stream and store its positions in another DynamoDB table, let's say "second-app-table".

Each table will contain the "what is the last processed position on shard X for app Y" information. So the two applications store checkpoints for the same shards in different places, which makes them independent.
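
To make the independence concrete, here is a small in-memory model. It is only a sketch, not KCL code: a plain dict stands in for each DynamoDB table, a list stands in for one shard, sequence numbers are small integers instead of the long numeric strings Kinesis really uses, and the names `CheckpointTable` and `consume` are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointTable:
    """Stand-in for one app's DynamoDB table: shard id -> last processed sequence number."""
    name: str
    positions: dict = field(default_factory=dict)


def consume(shard_id, records, table, max_records):
    """Read up to max_records from the shard, starting after this app's own checkpoint."""
    last = table.positions.get(shard_id)  # None means "start from the oldest record"
    pending = [r for r in records if last is None or r["seq"] > last]
    batch = pending[:max_records]
    if batch:
        # Store the checkpoint once the batch has been processed.
        table.positions[shard_id] = batch[-1]["seq"]
    return [r["data"] for r in batch]


# One shard holding five records with increasing sequence numbers.
shard = [{"seq": n, "data": f"event-{n}"} for n in range(1, 6)]

first_app = CheckpointTable("first-app-table")
second_app = CheckpointTable("second-app-table")

# Both apps read the same shard, but each advances only its own checkpoint.
fast = consume("shard-0", shard, first_app, max_records=4)
slow = consume("shard-0", shard, second_app, max_records=1)
```

After these two calls, "first-app-table" holds position 4 for "shard-0" while "second-app-table" holds position 1, so the slower application simply continues from its own position on its next read.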

As for the ingestion rate, consumer applications using the KCL have an "idleTimeBetweenReadsInMillis" setting, which is the polling interval for Get operations against the Amazon Kinesis API. For example, the first application can use a poll interval of "2000", so it polls the stream's shards every 2 seconds to check whether any new records have arrived.
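
The effect of that setting can be pictured as a fixed pause between reads. The loop below is a rough illustration under that assumption, not how the KCL is implemented internally; `poll_loop` and `fetch` are hypothetical names, and the sleep function is injectable so the demo does not actually wait.

```python
import time


def poll_loop(fetch, idle_time_between_reads_ms, polls, sleep=time.sleep):
    """Call fetch() `polls` times, idling for a fixed interval between calls."""
    received = []
    for i in range(polls):
        received.extend(fetch())  # a poll may return zero records
        if i < polls - 1:
            sleep(idle_time_between_reads_ms / 1000.0)
    return received


# Demo with a fake sleep that only records the requested pauses.
slept = []
batches = iter([["a", "b"], [], ["c"]])
result = poll_loop(lambda: next(batches), 2000, polls=3, sleep=slept.append)
```

A second application could run the same loop with a different interval, which is what lets the two read the same shards at different paces.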

I don't know Kafka well, but as far as I remember, a Kafka "partition" is a "shard" in Kinesis, and likewise a Kafka "offset" is a "sequence number" in Kinesis. The Kinesis Client Library (KCL) uses the term "checkpoint" for the stored sequence numbers. As you said, the concepts are similar.
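
If you are not using the KCL, the "offset" idea maps onto the plain Kinesis API roughly as follows. This is a sketch under stated assumptions: `read_after_checkpoint` is a hypothetical helper, the stream name "my-stream" is invented, and `FakeKinesis` is a stub included only so the example runs without AWS. The `get_shard_iterator` and `get_records` calls and their parameters follow the boto3 Kinesis client, which you would pass in instead of the stub.

```python
def read_after_checkpoint(client, stream_name, shard_id, checkpoint=None, limit=100):
    """Fetch one batch from a shard, resuming after `checkpoint` if there is one."""
    if checkpoint is None:
        # No stored position yet: start from the oldest record still retained.
        iterator = client.get_shard_iterator(
            StreamName=stream_name,
            ShardId=shard_id,
            ShardIteratorType="TRIM_HORIZON",
        )["ShardIterator"]
    else:
        # Resume right after the last sequence number this app processed.
        iterator = client.get_shard_iterator(
            StreamName=stream_name,
            ShardId=shard_id,
            ShardIteratorType="AFTER_SEQUENCE_NUMBER",
            StartingSequenceNumber=checkpoint,
        )["ShardIterator"]
    records = client.get_records(ShardIterator=iterator, Limit=limit)["Records"]
    new_checkpoint = records[-1]["SequenceNumber"] if records else checkpoint
    return records, new_checkpoint


class FakeKinesis:
    """Tiny stub mimicking the two calls above so the sketch runs without AWS."""

    def __init__(self, records):
        self.records = records

    def get_shard_iterator(self, StreamName, ShardId, ShardIteratorType,
                           StartingSequenceNumber=None):
        if ShardIteratorType == "TRIM_HORIZON":
            start = 0
        else:
            seqs = [r["SequenceNumber"] for r in self.records]
            start = seqs.index(StartingSequenceNumber) + 1
        return {"ShardIterator": str(start)}

    def get_records(self, ShardIterator, Limit):
        start = int(ShardIterator)
        return {"Records": self.records[start:start + Limit]}


fake = FakeKinesis([{"SequenceNumber": str(n), "Data": b"x"} for n in (101, 102, 103)])
batch1, cp1 = read_after_checkpoint(fake, "my-stream", "shardId-000000000000", limit=2)
batch2, cp2 = read_after_checkpoint(fake, "my-stream", "shardId-000000000000", checkpoint=cp1)
```

With real AWS access you would create the client with `boto3.client("kinesis")` and persist the returned checkpoint somewhere per application, which is the bookkeeping the KCL does for you in DynamoDB.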

answered Oct 28 '22 23:10

az3
