Increase number of shards in DynamoDB to spin up more lambdas in parallel

I'm currently using DynamoDB Streams to process changed items with Lambda functions. However, only two Lambda instances are running in parallel, which is not enough to process all the incoming data, so invocations just queue up.

From the AWS documentation I can see that the number of Lambda functions that can run in parallel is proportional to the number of shards in your DynamoDB stream:

If you create a Lambda function that processes events from stream-based services (Amazon Kinesis Streams or DynamoDB streams), the number of shards per stream is the unit of concurrency. If your stream has 100 active shards, there will be 100 Lambda functions running concurrently. Then, each Lambda function processes events on a shard in the order that they arrive.

So my question is: how do I increase the number of shards of my DynamoDB stream? Is it even possible? I couldn't find a way to set it in the settings.
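Since the shard count is the concurrency unit, the first step is usually to find out how many shards the stream currently has. A minimal sketch of that check, assuming the description comes from a `DescribeStream` call (in practice via boto3's `dynamodbstreams` client; the response dict below is a hypothetical, trimmed example of that payload shape):

```python
def count_active_shards(stream_description):
    """Count open shards: a closed shard has an EndingSequenceNumber
    in its SequenceNumberRange, an open one does not."""
    return sum(
        1
        for shard in stream_description["Shards"]
        if "EndingSequenceNumber" not in shard["SequenceNumberRange"]
    )

# Hypothetical, trimmed StreamDescription payload for illustration:
description = {
    "Shards": [
        {"ShardId": "shardId-0001",
         "SequenceNumberRange": {"StartingSequenceNumber": "100",
                                 "EndingSequenceNumber": "200"}},  # closed
        {"ShardId": "shardId-0002",
         "SequenceNumberRange": {"StartingSequenceNumber": "201"}},  # open
    ]
}
print(count_active_shards(description))  # 1
```

Only the open shards contribute to concurrent Lambda invocations; closed shards are drained and then expire.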

inside asked Feb 17 '17

People also ask

How many lambdas are in a Kinesis shard?

Lambda can process up to 10 batches in each shard simultaneously. If you increase the number of concurrent batches per shard, Lambda still ensures in-order processing at the partition-key level.
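The two numbers quoted here combine multiplicatively: shards set the baseline, and the event source mapping's `ParallelizationFactor` (1 to 10) multiplies it. A small sketch of that arithmetic (the function name is mine, not an AWS API):

```python
def max_concurrent_invocations(shard_count: int, parallelization_factor: int = 1) -> int:
    """Upper bound on concurrent Lambda invocations for one stream:
    one batch per shard by default, up to 10 concurrent batches per
    shard when ParallelizationFactor is raised on the mapping."""
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("ParallelizationFactor must be between 1 and 10")
    return shard_count * parallelization_factor

print(max_concurrent_invocations(100))      # 100 -- the quoted 100-shard example
print(max_concurrent_invocations(100, 10))  # 1000
```

Note that in-order processing is still guaranteed per partition key, not per shard, when the factor is above 1.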

How many shards are there in DynamoDB?

By default the account limit is 100 concurrent executions, so if your stream has more than 100 shards, which would try to run more than 100 Lambda invocations concurrently, you will hit that limit.

Does DynamoDB support sharding?

Partitions, keys, and write sharding: at creation, DynamoDB evenly distributes the table's capacity across the underlying partitions.


1 Answer

No, it's not possible to manually control the number of shards in a DynamoDB stream. DynamoDB handles that automatically for you, creating as many shards as needed to match the incoming rate of updates.

Every update to your table flows through some shard, and updates to the same record always go to the same shard (records are partitioned by your hash key). The stream delivers updates in chronological order, so updates to the same record end up queued in the same shard and the end processor handles them in the sequence they happened.

Each shard has its own throughput capacity for data in and out. DynamoDB only adds shards when the incoming rate of updates outgrows what the current shards can handle, which for a DynamoDB stream means a write TPS on your table that the current number of shards can't keep up with.
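Since you can't set the shard count directly, one workaround on the table side is write sharding: appending a calculated suffix to a hot partition key so its writes spread over more partitions (and hence, over time, more stream shards). A minimal sketch, with hypothetical names; the suffix is derived from a per-item attribute so reads can recompute it:

```python
import hashlib

def write_sharded_key(hot_key: str, item_id: str, shards: int = 8) -> str:
    """Spread items that share one hot partition key across `shards`
    logical partitions. The suffix is a stable hash of the item id,
    so the same item always maps to the same sharded key."""
    suffix = int(hashlib.sha256(item_id.encode()).hexdigest(), 16) % shards
    return f"{hot_key}#{suffix}"

# Different items under the same hot key land on different sharded keys,
# but the mapping for any one item is deterministic:
print(write_sharded_key("user-42", "order-1"))
print(write_sharded_key("user-42", "order-1"))  # same as above
```

The trade-off is that reading "everything for user-42" now requires querying all `shards` key variants and merging the results.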

gitesh.tyagi answered Oct 21 '22