What is the best database schema to support values that are only appropriate to specific rows?

Tags:

database-design

I have a db table called Calendar with fields

Id (PK)
Name
Description
CalendarTypeId (FK into CalendarType table)

I have another table called CalendarType with fields

Id (PK)
Name
Description

The issue is that I need to store an additional field for every calendar where the calendar type is 2 (this field would be irrelevant for any other calendar type).

Should I just create a new field in the Calendar table and ignore it for all calendars that have a different CalendarTypeId, or is there a better way to organize this schema to support this need?


asked Dec 31 '11 01:12

leora

1 Answer

Ok, this is the ER model of what you currently have (omitting cardinalities):

Now, let's focus on Calendar and SubCalendar, the child entity that holds the extra attribute. Clearly, you have a hierarchy there. But how are hierarchies turned into tables? There are three common ways to do this:

1) Kill the parent and keep the children: In this case you remove the parent entity and copy all of its fields into each of the children. In your example you have only one child, so all of the parent's attributes go to it.

Advantages: No null values, as each table has everything it needs. No joins are required either. If your queries usually look for just one type of child, this schema is useful because you don't need to filter by type: each table stores only one type.

Disadvantages: This schema is not appropriate when children overlap. In other words, if a parent row can belong to more than one child, the parent data will be duplicated in each of those children. Don't use this strategy if that is the case. Additionally, if you have many children and very few records in each, you'll end up with many small tables, which can be a little harder to manage.
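Option 1 could be sketched roughly as follows, using Python's built-in sqlite3 module. SubCalendar and SubAttribute follow the names used in this answer; the column types, the sample rows, and the choice to keep ordinary calendars in their own Calendar table are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each concrete type gets its own table with every column it needs.
# There is no CalendarType column: the table itself identifies the type.
conn.executescript("""
CREATE TABLE Calendar (
    Id          INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,
    Description TEXT
);
CREATE TABLE SubCalendar (
    Id           INTEGER PRIMARY KEY,
    Name         TEXT NOT NULL,
    Description  TEXT,
    SubAttribute TEXT NOT NULL   -- the extra field, never null here
);
INSERT INTO Calendar (Id, Name, Description)
    VALUES (1, 'Holidays', 'Public holidays');
INSERT INTO SubCalendar (Id, Name, Description, SubAttribute)
    VALUES (2, 'Shifts', 'Work shifts', 'night');
""")

# Querying one type needs neither a join nor a type filter.
sub_rows = conn.execute(
    "SELECT Name, SubAttribute FROM SubCalendar"
).fetchall()
plain_count = conn.execute("SELECT COUNT(*) FROM Calendar").fetchone()[0]
print(sub_rows, plain_count)
```

Note that listing every calendar regardless of type would need a UNION over both tables, which is the price of this layout.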

2) Kill the children and keep the parent: In this case you remove all the children and move all of their attributes to the parent. As the parent is now a mix of itself and all of its children, it needs a way to tell which type of child each row belongs to. This is accomplished by adding a new attribute to the parent entity that identifies the type of each row (whatever its data type).

Advantages: There is only one table for all children, so it is easy to manage. No joins are required. It can be useful if most of the queries run against this table need results from more than one type of child.

Disadvantages: Again, if a parent row can relate to more than one child, data will be duplicated because there will be one row for each of them, so this solution has a limitation. Additionally, a new column has to be added as metadata. This single table will hold more rows than any table in the other options. Null values have to be assigned to the attributes that belong to one child but not to the parent or the other children.
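Option 2 is essentially what the question proposes: one Calendar table with a nullable extra column. A rough sketch with Python's sqlite3 module follows. The CHECK constraint is an optional addition that is not part of the original answer, and the column types and sample rows are assumptions; it simply makes the database enforce "SubAttribute only for type 2" instead of relying on the application to ignore the column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE CalendarType (
    Id          INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,
    Description TEXT
);
CREATE TABLE Calendar (
    Id             INTEGER PRIMARY KEY,
    Name           TEXT NOT NULL,
    Description    TEXT,
    CalendarTypeId INTEGER NOT NULL REFERENCES CalendarType(Id),
    SubAttribute   TEXT,  -- only meaningful when CalendarTypeId = 2
    -- Assumed extra rule: required for type 2, forbidden otherwise.
    CHECK ((CalendarTypeId = 2 AND SubAttribute IS NOT NULL)
        OR (CalendarTypeId <> 2 AND SubAttribute IS NULL))
);
INSERT INTO CalendarType (Id, Name) VALUES (1, 'Plain');
INSERT INTO CalendarType (Id, Name) VALUES (2, 'Special');
INSERT INTO Calendar VALUES (1, 'Holidays', NULL, 1, NULL);
INSERT INTO Calendar VALUES (2, 'Shifts', NULL, 2, 'night');
""")

# A type-1 calendar that tries to set SubAttribute violates the CHECK.
try:
    conn.execute("INSERT INTO Calendar VALUES (3, 'Broken', NULL, 1, 'oops')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Calendar").fetchone()[0]
print(rejected, row_count)
```

Without the CHECK constraint the schema still works, but nothing stops a row of another type from carrying a stray SubAttribute value.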

3) Keep all: The least bloody solution is not to kill anything :) In this case the hierarchy is replaced by a relationship between the parent and each of the children. A child then joins to the parent table through a foreign key to reach the parent's data.

Advantages: There is no data duplication and there are no null values. Each entity holds just the minimal amount of data, and the rest can be obtained by joining to the parent table. A parent row can be linked to more than one child without duplicating data. If many of your queries can be satisfied with only one table (usually the parent), this is a good option. It is also easy to extend: if a new kind of calendar requires new fields, you add a new table without modifying the existing ones.

Disadvantages: Requires the most tables (actually one more than the first option). A join is needed for each child, which will degrade performance as the dataset grows. Additionally, foreign keys are needed to join the tables. If most queries require data from both parent and children, this schema will be the worst in terms of performance.
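Option 3 might look like the sketch below, again with Python's sqlite3 module. The key column name CalendarId, the column types, and the sample rows are assumptions, and the CalendarType table is left out for brevity. The child table's primary key doubles as the foreign key, so each calendar can have at most one SubCalendar row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
CREATE TABLE Calendar (
    Id          INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,
    Description TEXT
);
CREATE TABLE SubCalendar (
    CalendarId   INTEGER PRIMARY KEY REFERENCES Calendar(Id),
    SubAttribute TEXT NOT NULL
);
INSERT INTO Calendar (Id, Name) VALUES (1, 'Holidays');
INSERT INTO Calendar (Id, Name) VALUES (2, 'Shifts');
INSERT INTO SubCalendar (CalendarId, SubAttribute) VALUES (2, 'night');
""")

# Parent-only queries touch one table; the extra field costs one join.
rows = conn.execute("""
    SELECT c.Name, s.SubAttribute
    FROM Calendar AS c
    LEFT JOIN SubCalendar AS s ON s.CalendarId = c.Id
    ORDER BY c.Id
""").fetchall()

# A child row pointing at a missing parent is rejected by the foreign key.
try:
    conn.execute("INSERT INTO SubCalendar VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```

The LEFT JOIN returns NULL for SubAttribute on calendars with no child row, but no NULL is ever stored in either table.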

Now, you asked which is the best database schema. I think it is now clear it depends on the requirements, types of queries that will be run, the way data is structured, etc.

However, I can analyze this a little further. You said that you have a Calendar table and that some calendars need more data. So we can say there are 2 types of calendars, the parent and the child. We might think that solution 2 is a good fit because each row would represent one of the two types, but we would be wrong, because in this case each child includes its parent. Now, if we can assume that SubAttribute will always be non-null for a child and null for a parent, we can even remove CalendarType, which actually leads to solution 1.

Finally, as a rule of thumb (and mainly because most real-life queries involve lots of joins), if you want to focus on performance, go for solution 1; if you want to focus on a normalized design, go for solution 3.

I hope this has cleared some doubts and possibly generated others :)


answered Oct 07 '22 11:10

Mosty Mostacho
