
Aerospike data modeling and querying

Let's say I have the following model in Java:

class Shape {
    String type;
    String color;
    String size;
}

And say I have the following data based on the model above.

Triangle, Blue, Small
Triangle, Red, Large
Circle, Blue, Small
Circle, Blue, Medium
Square, Green, Medium
Star, Blue, Large

I would like to answer the following questions

Given the type Circle how many unique colors?
    Answer: 1
Given the type Circle how many unique sizes?
    Answer: 2

Given the color Blue how many unique shapes?
    Answer: 3
Given the color Blue how many unique sizes?
    Answer: 3

Given the size Small how many unique shapes?
    Answer: 2
Given the size Small how many unique colors?
    Answer: 1

I'm wondering if I should model it the following way...

set: shapes -> key: type -> bin(s): list of colors, list of sizes
set: colors -> key: color -> bin(s): list of shapes, list of sizes
set: sizes -> key: size -> bin(s): list of shapes, list of colors

Or is there a better way to do this? If I do it this way, I need three times the storage.
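A minimal in-memory sketch of the three-set layout above, using nested maps as a stand-in for sets, keys and bins. It does not call the Aerospike client; the class name ShapeLookup and its methods are hypothetical, and the bins hold sets rather than lists so that duplicates are dropped on write. It mainly illustrates the trade-off: each row costs six small writes, and each question becomes a single key lookup.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the proposed model:
// set name -> record key -> bin name -> unique values.
class ShapeLookup {
    private final Map<String, Map<String, Map<String, Set<String>>>> sets = new HashMap<>();

    // Every row is written once per lookup key, which is where the
    // roughly threefold storage cost comes from.
    void add(String type, String color, String size) {
        put("shapes", type, "colors", color);
        put("shapes", type, "sizes", size);
        put("colors", color, "shapes", type);
        put("colors", color, "sizes", size);
        put("sizes", size, "shapes", type);
        put("sizes", size, "colors", color);
    }

    private void put(String set, String key, String bin, String value) {
        sets.computeIfAbsent(set, s -> new HashMap<>())
            .computeIfAbsent(key, k -> new HashMap<>())
            .computeIfAbsent(bin, b -> new HashSet<>())
            .add(value);
    }

    // Answers "given <key>, how many unique <bin>?" with one lookup.
    int countUnique(String set, String key, String bin) {
        Map<String, Map<String, Set<String>>> records = sets.get(set);
        if (records == null) return 0;
        Map<String, Set<String>> record = records.get(key);
        if (record == null) return 0;
        Set<String> values = record.get(bin);
        return values == null ? 0 : values.size();
    }

    // Loads the six sample rows from the question.
    static ShapeLookup sample() {
        ShapeLookup db = new ShapeLookup();
        db.add("Triangle", "Blue", "Small");
        db.add("Triangle", "Red", "Large");
        db.add("Circle", "Blue", "Small");
        db.add("Circle", "Blue", "Medium");
        db.add("Square", "Green", "Medium");
        db.add("Star", "Blue", "Large");
        return db;
    }

    public static void main(String[] args) {
        ShapeLookup db = sample();
        System.out.println(db.countUnique("shapes", "Circle", "sizes")); // 2
        System.out.println(db.countUnique("colors", "Blue", "shapes"));  // 3
        System.out.println(db.countUnique("sizes", "Small", "colors"));  // 1
    }
}
```

With billions of entries per set, the size of each bin and the write amplification would need to be checked against real data before committing to this layout.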

I also expect to have billions of entries for each set. Btw the model has been redacted to protect the innocent code ;)

user432024 asked Mar 15 '23 14:03



1 Answer

Data modeling in NoSQL is always about how you plan to retrieve the data, at what throughput and at what latency.

There are several ways to model this data; the simplest is to mimic the class structure, with one record per shape and each field stored as a bin. You could then define a secondary index on each bin and use aggregation queries to answer the questions above.
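The logic of that approach can be sketched in plain Java: filter records on one field (the step a secondary index would serve), then count the distinct values of another field (the aggregation step). This is only an in-memory illustration under that assumption, not Aerospike client code; ShapeQuery, SAMPLE and countUnique are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ShapeQuery {
    // One record per shape; each field corresponds to one bin.
    static final class Shape {
        final String type;
        final String color;
        final String size;

        Shape(String type, String color, String size) {
            this.type = type;
            this.color = color;
            this.size = size;
        }
    }

    // The six sample rows from the question.
    static final List<Shape> SAMPLE = Arrays.asList(
        new Shape("Triangle", "Blue", "Small"),
        new Shape("Triangle", "Red", "Large"),
        new Shape("Circle", "Blue", "Small"),
        new Shape("Circle", "Blue", "Medium"),
        new Shape("Square", "Green", "Medium"),
        new Shape("Star", "Blue", "Large"));

    // Keep records whose filter bin equals the given value, then count
    // the distinct values found in the target bin.
    static long countUnique(List<Shape> records,
                            Function<Shape, String> filterBin,
                            String value,
                            Function<Shape, String> targetBin) {
        return records.stream()
            .filter(r -> value.equals(filterBin.apply(r)))
            .map(targetBin)
            .distinct()
            .count();
    }

    public static void main(String[] args) {
        // Given the type Circle, how many unique sizes?
        System.out.println(countUnique(SAMPLE, s -> s.type, "Circle", s -> s.size)); // 2
        // Given the color Blue, how many unique shapes?
        System.out.println(countUnique(SAMPLE, s -> s.color, "Blue", s -> s.type)); // 3
    }
}
```

Unlike a precomputed lookup, this stores each row once but does the counting at read time, so the cost of a query grows with the number of records that match the filter.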

But this is only one way; depending on your latency and throughput requirements, you may need a different data model.

Mnemaudsyne answered Apr 06 '23 09:04
