
database design - categories and sub-categories [closed]

I need to implement Categorization and Sub-Categorization on something which is a bit similar to golden pages.

Assume I have the following table:

Category Table

CategoryId, Title
10, Home
20, Business
30, Hobbies

I have two options to code the sub-categorization.

OPTION 1 - Subcategory Id is unique within Category ONLY:

Sub Category Table

CategoryId, SubCategoryId, Title
10, 100, Gardening
10, 110, Kitchen
10, 120, ...
20, 100, Development
20, 110, Marketing
20, 120, ...
30, 100, Soccer
30, 110, Reading
30, 120, ...

OPTION 2 - Subcategory Id is unique OVERALL:

Sub Category Table

CategoryId, SubCategoryId, Title
10, 100, Gardening
10, 110, Kitchen
10, 120, ...
20, 130, Development
20, 140, Marketing
20, 150, ...
30, 160, Soccer
30, 170, Reading
30, 180, ...

Option 2 seems to make fetching rows from a table easier. For example:

SELECT BizTitle FROM tblBiz WHERE SubCatId = 170

whereas using Option 1 I'd have to write something like this:

SELECT BizTitle FROM tblBiz WHERE CatId = 30 AND SubCatId = 110

i.e., containing an extra AND
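The difference between the two lookups can be sketched as follows. This is only an illustration, assuming SQLite through Python's sqlite3 module; the table names tblBizOpt1 and tblBizOpt2 and the business titles are hypothetical, and the ids follow the two sub-category tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Option 1: SubCatId repeats across categories, so a row needs both ids.
CREATE TABLE tblBizOpt1 (BizTitle TEXT, CatId INTEGER, SubCatId INTEGER);
-- Option 2: SubCatId is unique overall, so it identifies the subcategory alone.
CREATE TABLE tblBizOpt2 (BizTitle TEXT, CatId INTEGER, SubCatId INTEGER);

-- Hypothetical businesses, filed under Gardening, Development and Reading.
INSERT INTO tblBizOpt1 VALUES
    ('Green Thumb Nursery', 10, 100),
    ('Codeworks',           20, 100),
    ('Corner Bookshop',     30, 110);
INSERT INTO tblBizOpt2 VALUES
    ('Green Thumb Nursery', 10, 100),
    ('Codeworks',           20, 130),
    ('Corner Bookshop',     30, 170);
""")

def titles(sql):
    """Run a query and return the BizTitle values as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# Option 1 without CatId is ambiguous: 100 means Gardening AND Development.
opt1_sub_only = titles(
    "SELECT BizTitle FROM tblBizOpt1 WHERE SubCatId = 100 ORDER BY BizTitle")

# Option 1 needs the extra AND to pin down Reading (30, 110).
opt1_both = titles(
    "SELECT BizTitle FROM tblBizOpt1 WHERE CatId = 30 AND SubCatId = 110")

# Option 2 finds Reading with the single id 170.
opt2_sub_only = titles(
    "SELECT BizTitle FROM tblBizOpt2 WHERE SubCatId = 170")
```

The extra AND in Option 1 is not just more typing: leaving it out silently returns rows from unrelated categories.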

However, Option 1 is easier to maintain manually (when I need to update and insert new subcategories, etc.), and it is more pleasant to the eye in my opinion.

Any thoughts about it? Is Option 2 worth the trouble in terms of efficiency? Are there any design patterns related to this common issue?

dsb asked Jul 25 '15 06:07



2 Answers

I would use this structure:

ParentId, CategoryId, Title
null, 1, Home
null, 2, Business
null, 3, Hobbies
1, 4, Gardening
1, 5, Kitchen
1, 6, ...
2, 7, Development
2, 8, Marketing
2, 9, ...
3, 10, Soccer
3, 11, Reading
3, 12, ...

In detail:

  • only use one table, which references itself, so that you can have unlimited depth of categories
  • use technical ids (using IDENTITY, or similar), so that you can have more than 10 subcategories
  • if required, add a human-readable category number as a separate column

As long as you are only using two levels of categories you can still select like this:

SELECT BizTitle FROM tblBiz WHERE ParentId = 3 AND CategoryId = 11
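The single self-referencing table might be declared like this. It is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than any particular server; the table name Category is an assumption, and the ids are written out explicitly only so they match the listing above (an auto-assigned key would normally fill them in).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Category (
    CategoryId INTEGER PRIMARY KEY,                      -- technical id
    ParentId   INTEGER REFERENCES Category(CategoryId),  -- NULL = top level
    Title      TEXT NOT NULL
);
INSERT INTO Category (ParentId, CategoryId, Title) VALUES
    (NULL, 1, 'Home'),
    (NULL, 2, 'Business'),
    (NULL, 3, 'Hobbies'),
    (1,    4, 'Gardening'),
    (1,    5, 'Kitchen'),
    (2,    7, 'Development'),
    (2,    8, 'Marketing'),
    (3,   10, 'Soccer'),
    (3,   11, 'Reading');
""")

# All direct subcategories of Hobbies (CategoryId 3).
hobby_children = [row[0] for row in conn.execute(
    "SELECT Title FROM Category WHERE ParentId = 3 ORDER BY CategoryId")]

# The foreign key rejects a subcategory whose parent does not exist.
try:
    conn.execute(
        "INSERT INTO Category (ParentId, CategoryId, Title) "
        "VALUES (999, 13, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because CategoryId is unique across the whole table, a lookup by CategoryId alone is enough; the ParentId filter is only needed when listing the children of one category.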

The hierarchyid data type of SQL Server also looks quite promising: https://msdn.microsoft.com/en-us/library/bb677173.aspx


What I don't like about the Nested Set Model:

  • Inserting and deleting items in the Nested Set Model is quite complicated and requires expensive locks.
  • One can easily create inconsistencies, which a parent field combined with a foreign key constraint would prevent.
    • Inconsistencies can appear if rght is lower than lft
    • Inconsistencies can appear if a value appears in several rght or lft fields
    • Inconsistencies can appear if you create gaps
    • Inconsistencies can appear if you create overlaps
  • The Nested Set Model is in my opinion more complex and therefore not as easy to understand. This is absolutely subjective, of course.
  • The Nested Set Model requires two fields, instead of one - and so uses more disk space.
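To illustrate the insertion cost mentioned in the first point, here is a rough sketch of adding one node to a nested set. It assumes SQLite through Python's sqlite3 module; the table name NestedCategory and the helper insert_last_child are hypothetical, and only the lft and rght column names come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NestedCategory (Title TEXT PRIMARY KEY, lft INTEGER, rght INTEGER);
INSERT INTO NestedCategory VALUES
    ('Hobbies',   1,  6),
    ('Soccer',    2,  3),
    ('Reading',   4,  5),
    ('Business',  7, 10),
    ('Marketing', 8,  9);
""")

def insert_last_child(conn, parent_title, title):
    """Add `title` as the last child of `parent_title` (hypothetical helper)."""
    (parent_rght,) = conn.execute(
        "SELECT rght FROM NestedCategory WHERE Title = ?",
        (parent_title,)).fetchone()
    with conn:  # one transaction: all three statements succeed or none do
        # Open a gap of width 2 by shifting every boundary at or after
        # the parent's right edge, including rows in unrelated branches.
        conn.execute(
            "UPDATE NestedCategory SET rght = rght + 2 WHERE rght >= ?",
            (parent_rght,))
        conn.execute(
            "UPDATE NestedCategory SET lft = lft + 2 WHERE lft > ?",
            (parent_rght,))
        # The new leaf occupies the gap.
        conn.execute(
            "INSERT INTO NestedCategory (Title, lft, rght) VALUES (?, ?, ?)",
            (title, parent_rght, parent_rght + 1))

insert_last_child(conn, "Hobbies", "Chess")

after_insert = conn.execute(
    "SELECT Title, lft, rght FROM NestedCategory ORDER BY lft").fetchall()

# Rows that break the basic invariant lft < rght (should be none).
broken = conn.execute(
    "SELECT COUNT(*) FROM NestedCategory WHERE rght <= lft").fetchone()[0]
```

Adding Chess under Hobbies also rewrites the Business and Marketing rows, which is why concurrent writers have to be locked out of a large part of the table. With a parent column, the same insert touches one row.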
slartidan answered Sep 17 '22 13:09


There are several ways to manage hierarchical data. One of the most important is the Nested Set Model. See here for an implementation. Even some content management systems, such as Joomla, use this structure.

Update 2020: Given the concerns raised about this post, I should say that I now prefer the Adjacency List Model over the Nested Set Model, since it is less complex. See here for an implementation as well.
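An adjacency list can still be read at any depth with a recursive query. The following is a minimal sketch, assuming SQLite (which supports WITH RECURSIVE) through Python's sqlite3 module; the table name Category, the helper subtree, and the third-level row 'Youth leagues' are hypothetical and exist only to show a deeper level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (
    CategoryId INTEGER PRIMARY KEY,
    ParentId   INTEGER REFERENCES Category(CategoryId),
    Title      TEXT NOT NULL
);
INSERT INTO Category (ParentId, CategoryId, Title) VALUES
    (NULL, 2, 'Business'),
    (NULL, 3, 'Hobbies'),
    (2,    8, 'Marketing'),
    (3,   10, 'Soccer'),
    (3,   11, 'Reading'),
    (10,  13, 'Youth leagues');  -- hypothetical third-level category
""")

SUBTREE_SQL = """
WITH RECURSIVE subtree(CategoryId, Title, depth) AS (
    -- anchor: the category we start from
    SELECT CategoryId, Title, 0 FROM Category WHERE CategoryId = ?
    UNION ALL
    -- step: every category whose parent is already in the result
    SELECT c.CategoryId, c.Title, s.depth + 1
    FROM Category AS c
    JOIN subtree AS s ON c.ParentId = s.CategoryId
)
SELECT Title, depth FROM subtree ORDER BY depth, CategoryId
"""

def subtree(conn, category_id):
    """Return (Title, depth) for a category and all of its descendants."""
    return conn.execute(SUBTREE_SQL, (category_id,)).fetchall()

hobbies_tree = subtree(conn, 3)
business_tree = subtree(conn, 2)
```

The query is more verbose than a Nested Set range check, but inserts and deletes stay single-row operations.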

Vahid Najafi answered Sep 21 '22 13:09