How to retrieve a row's position within a DynamoDB global secondary index and the total?

I'm implementing a leaderboard backed by DynamoDB and its global secondary indexes, as described in the developer guide: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html

But two things a leaderboard system really needs are your position within it and the total number of entries, so you can show "#1 of 2000" or similar.

With the index, the rows are sorted the correct way, and I'd assume these calls would be cheap enough to make, but so far I haven't found a way to do it in the docs. I really hope I don't have to read the entire table every single time to find out where a person is positioned in it, or to get the count of the entire table (although if the count isn't available, it could be calculated on a schedule and stored outside the table).

I know DescribeTable gives you information about the entire table, but I would be applying filters to the range key, so that wouldn't suit this purpose.

asked Mar 12 '15 02:03

seaders

3 Answers

I am not aware of any efficient way to get the ranking of a player. The dumb way is to run a query starting from the player with the highest points and move downward, incrementing a counter until you reach the target player. So for the user with the lowest points, you might end up scanning the whole range.
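A variant of that walk can be sketched as a paginated count: instead of fetching items, ask the index how many entries have more points than the player and add one. This is only a sketch. The attribute names `board` and `points` are assumptions, and `query` stands for any callable that behaves like the low-level client's `query` (for example `boto3.client("dynamodb").query`). `Select="COUNT"` still consumes read capacity for every entry it counts, so the cost grows with the player's rank.

```python
def count_rank(query, table, index, board, points):
    """Return the 1-based rank: entries with strictly more points, plus one.

    `query` is any callable with the shape of the DynamoDB client's query().
    The attribute names "board" and "points" are hypothetical.
    """
    kwargs = {
        "TableName": table,
        "IndexName": index,
        "Select": "COUNT",  # return counts only, not the items themselves
        "KeyConditionExpression": "#b = :b AND #p > :p",
        "ExpressionAttributeNames": {"#b": "board", "#p": "points"},
        "ExpressionAttributeValues": {
            ":b": {"S": board},
            ":p": {"N": str(points)},
        },
    }
    higher = 0
    while True:
        page = query(**kwargs)
        higher += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return higher + 1
        # More pages remain: continue from where the last page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Dropping the `#p > :p` condition from the key expression would count every entry under the hash key, which gives the "of 2000" total with the same pagination loop.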

That being said, you can still get the top 100 players (the leaders) with no problem. Just run a query starting from the player with the highest points, and set the query limit to 100.

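Also, for a given player, you can get the 100 players around them with similar points. You just need to run two queries like:

query with hashkey="" and rangekey <= the player's points, limit 50, in descending order (ScanIndexForward=false)
query with hashkey="" and rangekey >= the player's points, limit 50, in ascending order

The sort direction matters for the first query: in the default ascending order it would return the 50 lowest scores rather than the 50 closest to the player.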
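As a rough illustration, the two requests could be built like this for the low-level client's `query` call. The attribute names `board` and `points`, and the function name itself, are assumptions made for the example; only the request parameters (`KeyConditionExpression`, `ScanIndexForward`, `Limit` and so on) come from the DynamoDB API.

```python
def neighbour_queries(table, index, board, points, limit=50):
    """Build query parameters for the players just below and just above a score.

    Returns (below, above), each a dict that can be passed as
    client.query(**params). Attribute names are hypothetical.
    """
    base = {
        "TableName": table,
        "IndexName": index,
        "Limit": limit,
        "ExpressionAttributeNames": {"#b": "board", "#p": "points"},
        "ExpressionAttributeValues": {
            ":b": {"S": board},
            ":p": {"N": str(points)},
        },
    }
    # Descending order so the first results are the scores closest below.
    below = dict(
        base,
        KeyConditionExpression="#b = :b AND #p <= :p",
        ScanIndexForward=False,
    )
    # Ascending order so the first results are the scores closest above.
    above = dict(
        base,
        KeyConditionExpression="#b = :b AND #p >= :p",
        ScanIndexForward=True,
    )
    return below, above
```

Both conditions include the player's own score, so the player (and anyone tied with them) can appear in both result sets and would need de-duplicating before display.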

answered Nov 18 '22 02:11

Erben Mo

This was the exact same problem we faced when we were developing our app. These are the two solutions we came up with to deal with it:

1. Query your index with ScanIndexForward set to false and a limit of 1000; this gives you the top players (assuming your score/points attribute is the range key). Then apply the linear formula y = mx + b, where x is points and y is rank: take two samples, typically the first and last values returned, to solve for m and b. With a user's points you can then estimate their rank. This will not be the exact rank, only an approximation. Google does the same when you search for something in your mail: it shows an approximate count rather than the exact value on the first call.
2. Get all the records and store them in a cache until the next update. This is by far the best and least expensive approach, and it is the one we use.
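The linear estimate from the first solution might look like the sketch below. The function names are made up for the example, and clamping the result to the range 1..total is an added assumption rather than part of the answer. A straight line through two samples is only as good as the score distribution is linear, so treat the output as a rough figure.

```python
def fit_rank_line(sample_a, sample_b):
    """Solve y = m*x + b from two (points, rank) samples."""
    (x1, y1), (x2, y2) = sample_a, sample_b
    if x1 == x2:
        raise ValueError("samples must have different point values")
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return m, b


def estimate_rank(points, m, b, total):
    """Estimate a rank from points, clamped to the range 1..total."""
    rank = round(m * points + b)
    return max(1, min(total, rank))


# Usage: first and last rows of a top-N query as the two samples.
m, b = fit_rank_line((2000, 1), (1000, 1001))
approx = estimate_rank(1500, m, b, total=1500)
```

Because higher points mean a lower rank number, the slope m comes out negative.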

answered Nov 18 '22 03:11

Harshal Bulsara

The beauty of DynamoDB is that it is highly optimized for very specific (and common) use cases. The cost of this optimization is that many other use cases cannot be achieved as easily as with other databases. Unfortunately yours is one of them. That being said, there are perfectly valid and good ways to do this with DynamoDB. I happen to have built an application that has the same requirement as yours.

What you can do is enable DynamoDB Streams on your table and process item update events with a Lambda function. Every time the number of points for a user changes you re-compute their rank and update your item. Even if you use the same scan operation to re-compute the rank, this is still much better, because it moves the bulk of the cost from your read operation to your write operation, which is kind of the point of NoSQL in the first place. This approach also keeps your point updates fast and eventually consistent (the rank will not update immediately, but is guaranteed to update properly unless there's an issue with your Lambda function).
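A minimal sketch of the stream-processing side, assuming the stream is configured with the `NEW_AND_OLD_IMAGES` view type and that items carry a `user_id` key and a numeric `points` attribute (both names are assumptions). The `recompute_rank` callable is a placeholder for whatever ranking logic you choose; it is not a DynamoDB API.

```python
def changed_scores(event):
    """Return (user_id, new_points) for stream records whose points changed."""
    changes = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # ignore REMOVE events in this sketch
        data = record["dynamodb"]
        new_image = data.get("NewImage", {})
        old_image = data.get("OldImage", {})
        if "points" not in new_image:
            continue
        new_points = int(new_image["points"]["N"])
        old_points = int(old_image["points"]["N"]) if "points" in old_image else None
        # Skip updates that did not touch the score, such as a rank write-back.
        if new_points != old_points:
            changes.append((data["Keys"]["user_id"]["S"], new_points))
    return changes


def make_handler(recompute_rank):
    """Build a Lambda-style handler around a hypothetical rank function."""
    def handler(event, context):
        for user_id, points in changed_scores(event):
            recompute_rank(user_id, points)
    return handler
```

Comparing old and new points matters if the rank is written back to the same table: that write produces its own stream record, and without the check the function would keep triggering itself.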

I recommend going with this approach first and, once you reach scale, optimizing by caching your users by rank in something like Redis. The exception is if you already have experience with Redis and can set it up quickly, in which case you could start there. Pick whatever is simplest first. If you are concerned about your leaderboard changing too often, you can reduce the cost by re-computing the ranks of only the first, say, 100 users on each update, and scheduling another Lambda function to run every several minutes that scans all users and updates all their ranks at once.
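The scheduled full re-rank could be as simple as the following sketch, which takes the scanned scores as a plain dictionary and leaves reading from and writing back to DynamoDB out of scope. Giving tied players the same rank is an assumption; the answer does not say how ties should be handled.

```python
def assign_ranks(scores):
    """Map user_id -> rank from a dict of user_id -> points.

    The highest score gets rank 1. Tied scores share a rank, and the next
    distinct score takes its position in the ordering (1, 2, 2, 4).
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    previous_points = None
    current_rank = 0
    for position, (user_id, points) in enumerate(ordered, start=1):
        if points != previous_points:
            current_rank = position
            previous_points = points
        ranks[user_id] = current_rank
    return ranks
```

The length of the input dictionary doubles as the leaderboard total, so the same job can store that count alongside the ranks.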

answered Nov 18 '22 04:11

dols3m

Tags:

amazon-dynamodb

leaderboard