 

Kinesis Lambda Consumer Minimum Batch Size

I'm using AWS Lambda (node.js) as an AWS Kinesis consumer. I can see that you can set a maximum batch size, but I'm wondering if I can set a minimum batch size, so that I can ensure each Lambda invocation handles at least 50 (or any number of) records.

I'd like to have a minimum batch size because the Lambda consumer establishes a connection to an RDS MySQL instance, and I'm trying to keep the number of concurrent connections low.

If there isn't a config capability that would set a minimum, any workaround ideas would be appreciated.

Thanks.

asked Sep 01 '17 by cardosi


People also ask

What is batch size in Kinesis?

Batch size – The number of records to send to the function in each batch, up to 10,000. Lambda passes all of the records in the batch to the function in a single call, as long as the total size of the events doesn't exceed the payload limit for synchronous invocation (6 MB).

What is Lambda batch size?

Batch size – The number of records to send to the function in each batch. For a standard queue, this can be up to 10,000 records. For a FIFO queue, the maximum is 10. For a batch size over 10, you must also set the MaximumBatchingWindowInSeconds parameter to at least 1 second.
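There is no minimum batch size setting, but batch size and a batching window are both configurable on the event source mapping; the batching window lets Lambda wait to accumulate a fuller batch before invoking. A sketch using the AWS SDK for JavaScript's `createEventSourceMapping` call; the ARN, function name, and values are placeholders:

```javascript
// Sketch: configuring batch size and a batching window on a Kinesis
// event source mapping. All identifiers and values are illustrative.
const params = {
  EventSourceArn: "arn:aws:kinesis:us-east-1:123456789012:stream/my-stream",
  FunctionName: "my-consumer-function",
  BatchSize: 100, // maximum records per invocation (up to 10,000)
  MaximumBatchingWindowInSeconds: 5, // wait up to 5s to gather a fuller batch
  StartingPosition: "TRIM_HORIZON",
};

// const lambda = new (require("aws-sdk")).Lambda();
// lambda.createEventSourceMapping(params).promise();
console.log(params.BatchSize, params.MaximumBatchingWindowInSeconds);
```

Note these remain upper bounds: Lambda may still invoke with fewer records if the window elapses or the 6 MB payload limit is reached first.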

How many consumers can a Kinesis stream have?

You can register up to 20 consumers per data stream. A given consumer can only be registered with one data stream at a time. Only 5 consumers can be created simultaneously. In other words, you cannot have more than 5 consumers in a CREATING status at the same time.

Is Kinesis cheaper than Kafka?

Kafka requires more engineering hours for implementation and maintenance leading to a higher total cost of ownership (TCO). As an AWS cloud-native service, Kinesis supports a pay-as-you-go model leading to lower costs to achieve the same outcome.


1 Answer

One way could be to use Kinesis Firehose, which concatenates multiple incoming records based on the buffering configuration of your delivery stream.

  1. Send data to Firehose - Either put records directly to the Firehose stream using its API, or attach the Firehose to your existing Kinesis stream.
  2. Set S3 as your Firehose destination - This essentially aggregates your individual records and puts them in S3 as a single object. You can specify a delimiter, and even a transformation Lambda function to run on individual records.
  3. Listen for S3:PutObject - Attach your Lambda to the S3 bucket that receives these aggregated records from the Firehose stream.
answered Oct 16 '22 by John Bupit