Polymorphic relationships vs separate tables per type

I am working on a database with several types (e.g. User, Appointment, Task), each of which can have zero or more Notes associated with it.

The possible solutions I have come across for implementing these relationships are:

  1. Polymorphic relationship
  2. Separate table per type

Polymorphic Relationship

Many suggest this as the easiest solution to implement, and it seems to be the most common approach in frameworks that follow the Active Record pattern. I would add a table whose data is morphable:

[Image: notable table]

My notable_type would identify which type (User, Appointment, Task) a Note relates to, whilst notable_id would let me fetch the individual record from that type's table.
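As a rough illustration of the layout described above, here is a minimal sketch using Python's built-in sqlite3 module. Only notable_type and notable_id come from the question; the table name notes, the body column, the index, and the helper functions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- One notes table shared by every type (names are assumed).
    -- notable_id cannot be declared as a foreign key because it may
    -- refer to a row in users OR in tasks, depending on notable_type.
    CREATE TABLE notes (
        id           INTEGER PRIMARY KEY,
        body         TEXT NOT NULL,
        notable_type TEXT NOT NULL,
        notable_id   INTEGER NOT NULL
    );
    CREATE INDEX notes_notable_idx ON notes (notable_type, notable_id);
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")
conn.execute(
    "INSERT INTO notes (body, notable_type, notable_id) "
    "VALUES ('Call back', 'User', 1)"
)
# The database accepts a note for a task that does not exist.
conn.execute(
    "INSERT INTO notes (body, notable_type, notable_id) "
    "VALUES ('Orphan', 'Task', 999)"
)


def notes_for(notable_type, notable_id):
    """Return the note bodies attached to one record of the given type."""
    rows = conn.execute(
        "SELECT body FROM notes "
        "WHERE notable_type = ? AND notable_id = ? ORDER BY id",
        (notable_type, notable_id),
    )
    return [body for (body,) in rows]


def orphaned_task_notes():
    """Count notes that claim a task parent that is not in tasks."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM notes n WHERE n.notable_type = 'Task' "
        "AND NOT EXISTS (SELECT 1 FROM tasks t WHERE t.id = n.notable_id)"
    ).fetchone()
    return count
```

In this sketch the second note insert succeeds even though no matching task exists, so any consistency checking would have to live in application code.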

PROS:

  • Easy to scale: more models can be associated with the polymorphic class without schema changes
  • Limits table bloat
  • Results in one class that can be used by many other classes (DRY)

CONS:

  • More types can make querying more difficult and expensive as the data grows
  • Cannot have a foreign key constraint on notable_id
  • No database-level guarantee of data consistency

Separate Table per Type

Alternatively I could create a table for each type which is responsible for the Notes associated with that type only. The type_id foreign key would allow me to quickly obtain the individual type record.
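For comparison, a minimal sqlite3 sketch of one such per-type table follows. The user_notes name appears later in the question; the column names, the ON DELETE CASCADE rule, and the helper function are assumptions for illustration. Note that SQLite only enforces foreign keys once the foreign_keys pragma is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Notes for users only, so user_id can be a real foreign key.
    CREATE TABLE user_notes (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE,
        body    TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO user_notes (user_id, body) VALUES (1, 'Call back')")


def try_insert_orphan():
    """Attempt to attach a note to a missing user; report if it was accepted."""
    try:
        conn.execute(
            "INSERT INTO user_notes (user_id, body) VALUES (999, 'Orphan')"
        )
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = try_insert_orphan()

# Deleting the parent removes its notes because of ON DELETE CASCADE.
conn.execute("DELETE FROM users WHERE id = 1")
(remaining_notes,) = conn.execute("SELECT COUNT(*) FROM user_notes").fetchone()
```

Here the database itself rejects the orphaned note and cleans up notes when their user is deleted, which is the consistency guarantee the polymorphic layout cannot express.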

[Image: user notes table]

Many online deem the polymorphic relationship a code smell, and many articles advocate avoiding it in favour of an alternative (here and here for example).

PROS:

  • Allows us to use foreign keys effectively
  • Efficient data querying
  • Maintains data consistency

CONS:

  • Increases table bloat as each type requires a separate table
  • Results in multiple classes, each representing a separate type_notes table

Thoughts

The polymorphic relationship is certainly the simpler of the two options to implement, but the lack of foreign key constraints and therefore potential for consistency issues feels wrong.

A table per notes relationship (user_notes, task_notes etc.) with foreign keys seems the correct way (in keeping with design patterns), but it could result in a lot of tables as other types that can have notes are added, or as types similar to notes (e.g. events) are introduced.

It feels like my choice is either a simplified table structure, at the cost of forgoing foreign keys and accepting increased query overhead, or a larger number of identically structured tables that simplify queries and allow foreign keys.

Given my scenario, which of the above would be more appropriate, or is there an alternative I should consider?

asked Jun 23 '19 19:06 by Peppermintology

1 Answer

What is "table bloat"? Are you concerned about having too many tables? Many real-world databases I've worked on have between 100 and 200 tables, because that's what it takes.

If you're concerned with adding multiple tables, then why do you have separate tables for User, Appointment, and Task? If you had a multi-valued attribute for User, for example for multiple phone numbers per user, would you create a separate table for phones, or would you try to combine them all into the user table somehow? Or have a polymorphic "things that belong to other things" table for user phones, appointment invitees, and task milestones?

Answer: No, you'd create a Phone table, and use it to reference only the User table. If Appointments have invitees, that gets its own table (probably a many-to-many between appointments and users). If tasks have milestones, that gets its own table too.

The correct thing to do is to model your database tables like you would model object types in your application. You might like to read a book like SQL and Relational Theory: How to Write Accurate SQL Code 3rd Edition by C. J. Date to learn more about how tables are analogous to types.

You already know instinctively that being unable to create a foreign key is a red flag. A foreign key must reference exactly one parent table. This should be a clue that a polymorphic foreign key is not valid relational database design. Once you start thinking of tables and their attributes as concrete types (as described in SQL and Relational Theory), this will become obvious.

If you must create one notes table, you could make it reference one table called "Notable" which is like a superclass of User, Appointment, and Task. Then each of those three tables would also reference the primary key of Notable. This mimics the object-oriented structure of polymorphism, where a class Note can hold a reference to an object by its superclass type.
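One way the "Notable" supertype idea might look is sketched below with Python's sqlite3 module. The table and column names (notables, notes, body) and the helper functions are assumptions for illustration, not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Supertype table: one row per thing that can carry notes.
    CREATE TABLE notables (id INTEGER PRIMARY KEY);

    -- Each concrete type reuses the supertype's primary key as its own.
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY REFERENCES notables (id),
        name TEXT NOT NULL
    );
    CREATE TABLE tasks (
        id    INTEGER PRIMARY KEY REFERENCES notables (id),
        title TEXT NOT NULL
    );

    -- A single notes table with an ordinary foreign key to the supertype.
    CREATE TABLE notes (
        id         INTEGER PRIMARY KEY,
        notable_id INTEGER NOT NULL REFERENCES notables (id),
        body       TEXT NOT NULL
    );
""")


def new_notable_id():
    """Create a supertype row and return its generated id."""
    return conn.execute("INSERT INTO notables DEFAULT VALUES").lastrowid


def add_note(notable_id, body):
    """Insert a note; return False if the parent notable does not exist."""
    try:
        conn.execute(
            "INSERT INTO notes (notable_id, body) VALUES (?, ?)",
            (notable_id, body),
        )
    except sqlite3.IntegrityError:
        return False
    return True


user_id = new_notable_id()
conn.execute("INSERT INTO users (id, name) VALUES (?, 'Alice')", (user_id,))
task_id = new_notable_id()
conn.execute("INSERT INTO tasks (id, title) VALUES (?, 'Write report')", (task_id,))

user_note_ok = add_note(user_id, "Call back")
task_note_ok = add_note(task_id, "Due Friday")
orphan_note_ok = add_note(999, "Orphan")
```

Every note now has an enforceable parent, but each new User or Task needs a matching notables row first, and this sketch does nothing to stop one notables id being claimed by both a user and a task.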

But IMHO, that's more complex than it needs to be. I would just create separate tables for UserNotes, AppointmentNotes, and TaskNotes. I'm not troubled by having three more tables, and it makes your code clearer and more maintainable.

answered Oct 30 '22 06:10 by Bill Karwin