Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

designing database to hold different metadata information

So I am trying to design a database that will allow me to connect one product with multiple categories. This part I have figured. But what I am not able to resolve is the issue of holding different type of product details.

For example, the product could be a book (in which case i would need metadata that refers to that book like isbn, author etc) or it could be a business listing (which has different metadata) ..

How should I tackle that?

like image 782
Obaid Avatar asked Jul 13 '10 20:07

Obaid


People also ask

What is metadata in database design?

Metadata is data about data (see those examples to better understand this concept). Data in relational databases is stored in structured manner, organized in tables and columns and extended with constraints on the data - primary and unique constraints, foreign keys, check constraints or data types.

How is metadata stored in database?

Metadata can be stored in a variety of places. Where the metadata relates to databases, the data is often stored in tables and fields within the database. Sometimes the metadata exists in a specialist document or database designed to store such data, called a data dictionary or metadata repository.

Why metadata is important in a database design?

Metadata ensure that data are FAIR: Findable, Accessible, Interoperable and Re-usable. Findable: Metadata make it much easier to find relevant data. Most searches are done using text (like a Google search), so formats like audio, images, and video are limited unless text metadata is available.


Video Answer


2 Answers

This is called the Observation Pattern.

enter image description here

Three objects, for the example

Book Title = 'Gone with the Wind'  Author = 'Margaret Mitchell' ISBN   = '978-1416548898'  Cat Name = 'Phoebe' Color = 'Gray' TailLength = 9 'inch'  Beer Bottle Volume = 500 'ml' Color = 'Green' 

This is how tables may look like:

Entity EntityID    Name            Description    1        'Book'            'To read'    2        'Cat'             'Fury cat'     3        'Beer Bottle'     'To ship beer in' 

.

PropertyType PropertyTypeID   Name        IsTrait         Description    1            'Height'     'NO'       'For anything that has height'     2            'Width'      'NO'       'For anything that has width'     3            'Volume'     'NO'       'For things that can have volume'    4            'Title'      'YES'      'Some stuff has title'     5            'Author'     'YES'      'Things can be authored'     6            'Color'      'YES'      'Color of things'     7            'ISBN'       'YES'      'Books would need this'    8            'TailLength' 'NO'       'For stuff that has long tails'    9            'Name'       'YES'      'Name of things' 

.

Property PropertyID   EntityID  PropertyTypeID           1           1              4     -- book, title     2           1              5     -- book, author     3           1              7     -- book, isbn     4           2              9     -- cat, name     5           2              6     -- cat, color     6           2              8     -- cat, tail length     7           3              3     -- beer bottle, volume     8           3              6     -- beer bottle, color 

.

Measurement PropertyID     Unit       Value      6          'inch'       9          -- cat, tail length     7          'ml'        500         -- beer bottle, volume 

.

Trait PropertyID         Value      1         'Gone with the Wind'     -- book, title     2         'Margaret Mitchell'      -- book, author     3         '978-1416548898'         -- book, isbn     4         'Phoebe'                 -- cat, name     5         'Gray'                   -- cat, color     8         'Green'                  -- beer bottle, color 

EDIT:

Jefferey raised a valid point (see comment), so I'll expand the answer.

The model allows for dynamic (on-fly) creation of any number of entites with any type of properties without schema changes. Hovewer, this flexibility has a price -- storing and searching is slower and more complex than in a usual table design.

Time for an example, but first, to make things easier, I'll flatten the model into a view.

create view vModel as  select        e.EntityId     , x.Name  as PropertyName     , m.Value as MeasurementValue     , m.Unit     , t.Value as TraitValue from Entity           as e join Property         as p on p.EntityID       = p.EntityID join PropertyType     as x on x.PropertyTypeId = p.PropertyTypeId left join Measurement as m on m.PropertyId     = p.PropertyId left join Trait       as t on t.PropertyId     = p.PropertyId ; 

To use Jefferey's example from the comment

with  q_00 as ( -- all books     select EntityID     from vModel     where PropertyName = 'object type'       and TraitValue   = 'book'  ), q_01 as ( -- all US books     select EntityID     from vModel as a     join q_00   as b on b.EntityID = a.EntityID     where PropertyName = 'publisher country'       and TraitValue   = 'US'  ), q_02 as ( -- all US books published in 2008     select EntityID     from vModel as a     join q_01   as b on b.EntityID = a.EntityID     where PropertyName     = 'year published'       and MeasurementValue = 2008  ), q_03 as ( -- all US books published in 2008 not discontinued     select EntityID     from vModel as a     join q_02   as b on b.EntityID = a.EntityID     where PropertyName = 'is discontinued'       and TraitValue   = 'no'  ), q_04 as ( -- all US books published in 2008 not discontinued that cost less than $50     select EntityID     from vModel as a     join q_03   as b on b.EntityID = a.EntityID     where PropertyName     = 'price'       and MeasurementValue < 50        and MeasurementUnit  = 'USD' ) select       EntityID     , max(case PropertyName when 'title' than TraitValue else null end) as Title     , max(case PropertyName when 'ISBN'  than TraitValue else null end) as ISBN from vModel as a join q_04   as b on b.EntityID = a.EntityID group by EntityID ; 

This looks complicated to write, but on a closer inspection you may notice a pattern in CTEs.

Now suppose we have a standard fixed schema design where each object property has its own column. The query would look something like:

select EntityID, Title, ISBN from vModel WHERE ObjectType       = 'book'   and PublisherCountry = 'US'   and YearPublished    = 2008   and IsDiscontinued   = 'no'   and Price            < 50   and Currency         = 'USD' ; 
like image 99
Damir Sudarevic Avatar answered Sep 28 '22 06:09

Damir Sudarevic


I wasn't going to answer, but right now the accepted answer has a very bad idea. A relational database should never be used to store simple attribute-value pairs. That will cause a lot of problems down the road.

The best way to deal with this is to create a separate table for each type.

Product ------- ProductId Description Price (other attributes common to all products)  Book ---- ProductId (foreign key to Product.ProductId) ISBN Author (other attributes related to books)  Electronics ----------- ProductId (foreign key to Product.ProductId) BatteriesRequired etc. 

Each row of each table should represent a proposition about the real world, and the structure of the tables and their constraints should reflect the realities that are being represented. The closer you can get to this ideal, the cleaner the data will be, and the easier it will be to do reporting and to extend the system in other ways. It will also run more effeciently.

like image 30
Jeffrey L Whitledge Avatar answered Sep 28 '22 06:09

Jeffrey L Whitledge