I'm very new to databases and I'm a novice to data abstraction, coming from Java. To teach myself, I'm working on an online app that will, among other things, allow users to be part of multiple groups.
Sketching out the database, it seems I'll have to have something like a "Membership" table:
UserID|GroupID
------|-------
1 | 1
1 | 2
2 | 1
2 | 3
2 | 5
I feel a little wary of this, since it's only two foreign keys, and only serves to link two objects. Is this standard practice for this kind of relationship? If not, what is the preferred method?
Again, I'm very new to databases. My book doesn't mention this kind of situation, so if there's some keyword that reflects this function that I overlooked...
Thank you.
That's a standard way for representing many-to-many relationship and is known as "junction table" (or "link table").
You already noted that both UserID and GroupID are foreign keys that reference other tables. But when it comes to keys (not foreign keys), you have several options:
{UserID, GroupID}
. Beside ensuring same user cannot be connected to same group multiple times, it also facilitates the efficient search for groups of the given user. Since UserID is at the leading edge of the index (that the DBMS automatically creates under the key), all GroupID values associated to the same UserID are in a continuous range within the index B-tree, so getting groups of the given user can be done by the DBMS through a simple index range scan.{GroupID, UserID}
. Same fields, opposite order. This facilitates quickly getting users of the given group (i.e. querying in the opposite "direction" compared to (1)).{UserID, GroupID}
and (unique) index on {GroupID, UserID}
(or vice-versa). This is useful if you need to query in both directions: getting groups of given user and getting users of given group, respectively.{UserGroupID}
). This may be useful if you have "child" tables that reference the junction table, and you want to streamline the size of the key that is being migrated to them through foreign keys. It may also be useful if your ORM tool doesn't work well with composite keys.If you decide for options (1) or (2), cluster the table (if your DBMS supports it). Since you are only doing index range scans anyway, there is no need for the table heap to exist at all. You should even consider clustering for (3), since both indexes are covering so there is no danger of double-lookup.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With