multiple consumers per kinesis shard

I read that you can have multiple consumer applications per Kinesis stream.

http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html

However, I heard that you can only have one consumer per shard. Is this true? I can't find any documentation to support this, and I can't imagine how it could work if multiple consumers are reading from the same stream. Surely it doesn't mean the producer has to repeat content in different shards for different consumers.

asked Dec 29 '15 01:12 by bhomass



1 Answer

The Kinesis Client Library (KCL) starts threads in the background, and each thread listens to one shard in the stream. Within a single consumer application you cannot read a shard from multiple threads; that is by design.

http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-scaling.html

For example, if your application is running on one EC2 instance, and is processing one Amazon Kinesis stream that has four shards. This one instance has one KCL worker and four record processors (one record processor for every shard). These four record processors run in parallel within the same process.

In the explanation above, the term "KCL worker" refers to a Kinesis consumer application, not to its threads.

Below, however, the same term "KCL worker" refers to a "Worker" thread inside the application, which is a Runnable.

Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards (except for failure standby purposes). Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor, so you never need multiple instances to process one shard.
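To make the "instances should not exceed shards" point concrete, here is a minimal sketch in Python. It is not how the KCL actually assigns work (the KCL coordinates shard ownership through leases); the round-robin rule and the names `assign_shards`, `shard_ids`, and `instance_ids` are assumptions made purely for illustration. It only shows that each shard ends up with exactly one owner, so any instance beyond the shard count has nothing to process.

```python
def assign_shards(shard_ids, instance_ids):
    """Give every shard exactly one owning instance (illustrative round-robin).

    This is a hypothetical stand-in for the KCL's lease coordination,
    not a reimplementation of it.
    """
    assignment = {instance: [] for instance in instance_ids}
    for index, shard in enumerate(shard_ids):
        owner = instance_ids[index % len(instance_ids)]
        assignment[owner].append(shard)
    return assignment


shards = ["shard-0", "shard-1", "shard-2", "shard-3"]
instances = ["i-a", "i-b", "i-c", "i-d", "i-e", "i-f"]

assignment = assign_shards(shards, instances)

# Instances that received no shard are idle; they are only useful as standbys.
idle = [instance for instance, owned in assignment.items() if not owned]
```

With four shards and six instances, two instances own nothing, which matches the "except for failure standby purposes" remark in the quoted documentation.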

See the Worker.java class in KCL source.
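As a rough mental model of the single-instance case described above (one worker, one record processor per shard, all running in parallel in the same process), the following Python sketch may help. It does not mirror Worker.java; the in-memory `SHARDS` dictionary and the functions `record_processor` and `run_worker` are hypothetical stand-ins, and no AWS calls are made.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a stream with four shards.
SHARDS = {
    "shard-0": ["a", "b"],
    "shard-1": ["c"],
    "shard-2": ["d", "e", "f"],
    "shard-3": [],
}


def record_processor(shard_id, records):
    """Process the records of exactly one shard (here: just upper-case them)."""
    return shard_id, [record.upper() for record in records]


def run_worker(shards):
    """One worker process: one record processor per shard, run in parallel threads."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        futures = [
            pool.submit(record_processor, shard_id, records)
            for shard_id, records in shards.items()
        ]
        return dict(future.result() for future in futures)


results = run_worker(SHARDS)
```

Each shard is handed to a single processor, so no two threads in this worker ever read the same shard.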

answered Sep 30 '22 14:09 by az3