
How to get a distinct count in DynamoDB over a billion objects?

What is the most efficient way to count how many distinct objects are stored in my DynamoDB table?

For example, my objects have ten properties, and I want a distinct count based on three of them.

Rıfat Erdem Sahin asked Apr 09 '13 03:04




1 Answer

If you need counters, it's better to use atomic counters (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithDDItems.html). In your case, DynamoDB does not support keys composed of three attributes out of the box unless you concatenate them, so one option is to create a redundant table whose key is the concatenation of those three attributes. Each time you manage those objects, also update the atomic counter (on add and delete; updates don't actually need it).

Then you just query the counter and avoid scans. In other words, you trade extra storage for faster retrieval.
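A minimal sketch of that idea in Python, under several assumptions that are not in the answer above: the redundant table has a string partition key named `pk`, the counter lives in an attribute named `cnt`, `#` is a separator that never occurs in the attribute values, and the helper names (`composite_key`, `record_added`, `record_deleted`, `distinct_count`) are hypothetical. The `table` argument is meant to be a boto3 `Table` resource, and only its `update_item` and `get_item` calls are used. `InMemoryTable` is a stand-in so the sketch runs without AWS; it is not part of any library.

```python
from typing import Any, Dict, Tuple

SEP = "#"                          # assumed separator, must not occur in the values
COUNT_ATTR = "cnt"                 # assumed name of the counter attribute
TOTAL_KEY = "__distinct_total__"   # assumed key of the item holding the distinct count


def composite_key(item: Dict[str, Any], attrs: Tuple[str, str, str]) -> str:
    """Concatenate the three chosen attributes into a single key string."""
    return SEP.join(str(item[a]) for a in attrs)


def _add(table, key: str, delta: int) -> int:
    """Atomically add `delta` to the counter of item `key`; return the new value."""
    resp = table.update_item(
        Key={"pk": key},
        UpdateExpression="ADD #n :d",
        ExpressionAttributeNames={"#n": COUNT_ATTR},
        ExpressionAttributeValues={":d": delta},
        ReturnValues="UPDATED_NEW",
    )
    return int(resp["Attributes"][COUNT_ATTR])


def record_added(table, item: Dict[str, Any], attrs: Tuple[str, str, str]) -> None:
    """Call when an object is inserted into the main table."""
    if _add(table, composite_key(item, attrs), 1) == 1:
        _add(table, TOTAL_KEY, 1)   # first object with this combination


def record_deleted(table, item: Dict[str, Any], attrs: Tuple[str, str, str]) -> None:
    """Call when an object is removed from the main table."""
    if _add(table, composite_key(item, attrs), -1) == 0:
        _add(table, TOTAL_KEY, -1)  # last object with this combination is gone


def distinct_count(table) -> int:
    """Read the distinct count with a single key lookup, no scan."""
    resp = table.get_item(Key={"pk": TOTAL_KEY})
    return int(resp.get("Item", {}).get(COUNT_ATTR, 0))


class InMemoryTable:
    """Local stand-in that mimics only the two calls used above."""

    def __init__(self) -> None:
        self.items: Dict[str, Dict[str, Any]] = {}

    def update_item(self, Key, ExpressionAttributeNames,
                    ExpressionAttributeValues, **_ignored):
        row = self.items.setdefault(Key["pk"], {"pk": Key["pk"]})
        name = ExpressionAttributeNames["#n"]
        row[name] = row.get(name, 0) + ExpressionAttributeValues[":d"]
        return {"Attributes": {name: row[name]}}

    def get_item(self, Key):
        row = self.items.get(Key["pk"])
        return {"Item": row} if row else {}
```

Note that the two updates in `record_added` and `record_deleted` are separate requests, so a failure between them can leave the total off by one, and a single total item may become a hot key under heavy write load. Treat this as an illustration of the trade-off rather than a production design.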

tavi answered Oct 29 '22 01:10
