
Advanced queries in HBase

Given the following HBase schema scenario (from the official FAQ)...

How would you design an HBase table for a many-to-many association between two entities, for example Student and Course?

I would define two tables:

Student:
    row key: student id
    family: student data (name, address, ...)
    family: courses (use course ids as column qualifiers here)

Course:
    row key: course id
    family: course data (name, syllabus, ...)
    family: students (use student ids as column qualifiers here)

This schema gives you fast access to two queries: show all courses for a student (Student table, courses family), or all students for a course (Course table, students family).
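To make the two lookups concrete, here is a minimal sketch that models the two tables as nested Python dictionaries (row key, then family, then qualifier). It is not HBase client code; the ids, the family names `data`, `courses` and `students`, and the helper names are all made up for illustration.

```python
# In-memory stand-in for the Student table: row key -> family -> qualifier -> value.
# Every id and value here is hypothetical.
student_table = {
    "s1": {"data": {"name": "Ann"}, "courses": {"c1": "", "c2": ""}},
    "s2": {"data": {"name": "Bob"}, "courses": {"c1": "", "c2": "", "c3": ""}},
    "s3": {"data": {"name": "Cy"}, "courses": {"c3": ""}},
}


def build_course_table(students):
    """Derive the mirror Course table so both directions are a single-row read."""
    courses = {}
    for student_id, row in students.items():
        for course_id in row["courses"]:
            course_row = courses.setdefault(course_id, {"students": {}})
            course_row["students"][student_id] = ""
    return courses


course_table = build_course_table(student_table)


def courses_for(student_id):
    """All courses for a student: read one row, list the 'courses' qualifiers."""
    return sorted(student_table[student_id]["courses"])


def students_in(course_id):
    """All students for a course: read one row, list the 'students' qualifiers."""
    return sorted(course_table[course_id]["students"])
```

Each query touches exactly one row, which is why the schema answers them quickly. The cost is that every enrollment has to be written twice, once per table.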

How would you satisfy the request: "Give me all the students that share at least two courses in common"? Can you build a "query" in HBase that will return that set, or do you have to retrieve all the pertinent data and crunch it yourself in code?

Teflon Ted asked Sep 16 '09 23:09


1 Answer

The query as described is better suited to a relational database. You can answer the query quickly, however, by precomputing the result. For example, you might have a table where the row key is the number of classes in common, and the cells are the individual students that have that many classes in common.
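One way to read the precomputed table, as a rough sketch: a job walks the Course rows, counts shared courses per pair of students, and stores each pair under its count. Storing pairs as `"a|b"` strings is my assumption, not something the schema dictates, and the sample data and function names are hypothetical. Plain dictionaries stand in for the HBase tables.

```python
from collections import Counter
from itertools import combinations

# Hypothetical Course-table contents: course id -> student ids
# (the qualifiers of the "students" family).
course_students = {
    "c1": ["s1", "s2"],
    "c2": ["s1", "s2", "s3"],
    "c3": ["s2", "s3"],
    "c4": ["s3"],
}


def shared_course_counts(courses):
    """Count, for every pair of students, how many courses they share."""
    counts = Counter()
    for students in courses.values():
        for pair in combinations(sorted(students), 2):
            counts[pair] += 1
    return counts


def precompute_by_count(courses):
    """Build the precomputed table: key = shared-course count, cells = student pairs."""
    table = {}
    for (a, b), n in shared_course_counts(courses).items():
        table.setdefault(n, set()).add(f"{a}|{b}")
    return table


def pairs_sharing_at_least(table, minimum):
    """Answer the query by reading every row whose key is >= minimum."""
    result = set()
    for n, pairs in table.items():
        if n >= minimum:
            result |= pairs
    return sorted(result)


precomputed = precompute_by_count(course_students)
```

With this data, `pairs_sharing_at_least(precomputed, 2)` reads only the rows keyed 2 and above. The crunching still happens in your own code, just ahead of time, and the table has to be rebuilt or updated whenever enrollments change.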

You could use a variant on this to answer questions like "which students are in class X and class Y": use the classes as pieces of the key (in alphabetical order, or some other consistent order), and again, each column is a student.
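A small sketch of that variant, again with dictionaries standing in for HBase tables: the row key is the two class ids sorted and joined, so (X, Y) and (Y, X) land on the same row. The `|` separator, the sample data and the function names are assumptions for illustration; class ids must not contain the separator.

```python
from itertools import combinations

# Hypothetical Student-table contents: student id -> class ids
# (the qualifiers of the "courses" family).
student_courses = {
    "s1": ["math", "art"],
    "s2": ["math", "art", "chem"],
    "s3": ["chem"],
}


def pair_key(class_x, class_y):
    """Build a row key from two class ids in a consistent (alphabetical) order."""
    return "|".join(sorted((class_x, class_y)))


def build_pair_index(students):
    """For every pair of classes a student takes, add the student as a column."""
    index = {}
    for student_id, classes in students.items():
        for x, y in combinations(sorted(set(classes)), 2):
            index.setdefault(pair_key(x, y), set()).add(student_id)
    return index


pair_index = build_pair_index(student_courses)


def students_in_both(class_x, class_y):
    """Single-row lookup: which students are in class X and class Y."""
    return sorted(pair_index.get(pair_key(class_x, class_y), set()))
```

Because the key is order-independent, `students_in_both("math", "art")` and `students_in_both("art", "math")` hit the same row. The index grows with the number of class pairs per student, so it trades storage for lookup speed.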

jonathan-stafford answered Sep 23 '22 02:09