Given the following HBase schema scenario (from the official FAQ)...
How would you design an HBase table for a many-to-many association between two entities, for example Student and Course?
I would define two tables:
Student:
  student id
  student data (name, address, ...)
  courses (use course ids as column qualifiers here)
Course:
  course id
  course data (name, syllabus, ...)
  students (use student ids as column qualifiers here)
This schema gives you fast access to two queries: show all classes for a student (Student table, courses family), and show all students for a class (Course table, students family).
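As a rough illustration of why both lookups are cheap, here is a minimal in-memory sketch of the two tables in Python. It uses plain dictionaries rather than the HBase client API; the names `student_table`, `course_table`, `enroll`, `courses_of`, and `students_in` are hypothetical, and the assumption is that each enrollment is written to both tables with an empty cell value.

```python
# In-memory stand-in for the two tables (NOT the HBase client API).
# Layout: row key -> column family -> column qualifier -> cell value.
student_table = {}
course_table = {}


def enroll(student_id, course_id):
    """Record one enrollment on both sides, as the two-table schema requires."""
    student_row = student_table.setdefault(student_id, {"data": {}, "courses": {}})
    course_row = course_table.setdefault(course_id, {"data": {}, "students": {}})
    # The qualifier itself carries the information; the cell value stays empty.
    student_row["courses"][course_id] = b""
    course_row["students"][student_id] = b""


def courses_of(student_id):
    """All classes for a student: one row read, 'courses' family."""
    return sorted(student_table.get(student_id, {}).get("courses", {}))


def students_in(course_id):
    """All students for a class: one row read, 'students' family."""
    return sorted(course_table.get(course_id, {}).get("students", {}))


# Hypothetical sample data.
enroll("s1", "math")
enroll("s1", "physics")
enroll("s2", "math")

print(courses_of("s1"))
print(students_in("math"))
```

Each query touches a single row in a single table, which is the point of storing the association twice.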
How would you satisfy the request: "Give me all the students that share at least two courses in common"? Can you build a "query" in HBase that will return that set, or do you have to retrieve all the pertinent data and crunch it yourself in code?
The query as described is better suited to a relational database. You can answer it quickly, however, by precomputing the result. For example, you might keep a table whose row key is the number of classes in common and whose cells are the individual students that have that many classes in common.
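The precomputation could look something like the following Python sketch, which again models the table as a dictionary instead of calling HBase. It assumes the input is a mapping from course id to the set of enrolled students, and it stores each pair of students as one qualifier of the form `a|b`, which is my own choice for the example; `build_common_count_table` and `pairs_sharing_at_least` are hypothetical names.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_common_count_table(course_to_students):
    """Return {number of shared courses: set of 'a|b' student pairs}."""
    pair_counts = Counter()
    for students in course_to_students.values():
        # Every pair of students in the same course shares that course.
        for a, b in combinations(sorted(students), 2):
            pair_counts[(a, b)] += 1
    table = defaultdict(set)
    for (a, b), shared in pair_counts.items():
        table[shared].add(f"{a}|{b}")
    return dict(table)


def pairs_sharing_at_least(table, minimum):
    """Collect the pairs stored under every row key >= minimum."""
    result = set()
    for shared, pairs in table.items():
        if shared >= minimum:
            result |= pairs
    return sorted(result)


# Hypothetical sample data.
courses = {
    "math": {"s1", "s2", "s3"},
    "physics": {"s1", "s2"},
    "art": {"s2", "s3"},
}
table = build_common_count_table(courses)
print(pairs_sharing_at_least(table, 2))
```

In a real deployment this job would have to be rerun, or the table updated incrementally, whenever enrollments change. The pair counting is quadratic in class size, so it is better suited to a batch job than to a per-request computation.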
You could use a variant on this to answer questions like "which students are in class X and class Y": use the classes as pieces of the key (in alphabetical ordering, or something at least consistent), and again, each column is a student.
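A minimal Python sketch of that variant, under the same in-memory assumptions (dictionaries, not the HBase API): the row key is built from two course ids in sorted order, so both orderings of a query resolve to the same row. The `|` delimiter and the names `pair_key`, `build_course_pair_table`, and `students_in_both` are hypothetical, and the sketch assumes course ids never contain the delimiter.

```python
from itertools import combinations


def pair_key(course_x, course_y):
    """Build a row key from two course ids in a consistent (sorted) order."""
    first, second = sorted((course_x, course_y))
    return f"{first}|{second}"


def build_course_pair_table(student_to_courses):
    """Return {'courseA|courseB': set of students enrolled in both}."""
    table = {}
    for student, courses in student_to_courses.items():
        for x, y in combinations(sorted(set(courses)), 2):
            table.setdefault(pair_key(x, y), set()).add(student)
    return table


def students_in_both(table, course_x, course_y):
    """Single row lookup; argument order does not matter."""
    return sorted(table.get(pair_key(course_x, course_y), set()))


# Hypothetical sample data.
enrollments = {
    "s1": ["math", "physics"],
    "s2": ["physics", "math", "art"],
    "s3": ["art", "math"],
}
pair_table = build_course_pair_table(enrollments)
print(students_in_both(pair_table, "physics", "math"))
```

The cost is storage: a student taking n courses appears in n(n-1)/2 rows, so this only pays off when the pairwise query is frequent.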