I'm trying to understand some concepts of DB design.
I have three tables:
Movies (id,title)
1 - The godfather
2 - Matrix
Attribute (id,name)
1 - Country
2 - Type
Attribute Value(attribute_id,id,value)
1,1,USA
1,2,Japan
2,1,Thriller
2,2,Comedy
and I would like each movie to have one, and only one, value for any given attribute.
E.g.: Godfather, Country: USA, Type: Crime
I'm trying to find out which of the following is the best solution for linking attributes to a movie. I can see 4 different options:
Schema A
The problem I see is that I can't prevent multiple attribute values of the same attribute from being linked to a movie, i.e. ("godfather","USA","JAPAN") is a valid statement. The restriction would have to be enforced by the application.
Schema B
It's almost the same as Schema A, but it makes Attribute Value a weak entity. I think this has no effect at the database level, but it would make it a bit harder to fetch attribute values, as you need the attribute key as well. This schema allows the same category to be repeated multiple times with different values, so I don't think it is a good option either. As with option A, the restriction would have to be enforced by the application.
("godfather","Country:USA","Country:JAPAN") is a valid statement
Schema C
I think this is the correct one, as now we can't add more than one attribute of the same type to a movie: “Godfather”, “USA”, “JAPAN” is not a valid insertion!
But I can't tell whether making attribute_value a weak entity would be correct or not, better or worse :S
Schema D
As I said, this is the same as C but with a composite key in attribute_value. I'm not sure if this breaks any database normalization rules. If it is OK, which table should movie_attribute_value reference for the attribute_id field: the attribute ID from the Attribute table, or the attribute ID from the AttributeValue table? Is it OK to have a composite foreign key and use only part of it in the PK?
Could you please explain which option is better and why?
Thanks in advance!
EDIT
I understand the problems of a design like this, what an EAV schema is, and the need to avoid this type of schema except where the attribute table changes a lot. Unfortunately that is my scenario: the attributes of a movie are defined by users, so I have no way of knowing which attributes are going to be used. I have to read them and display them to other users to fill in. I think Schema C is correct, but I would like to know what the problem is with using schemas A and B and letting developers enforce the restriction (one attribute of the same type per movie) in the code.
It would also be great if somebody could explain the benefits and pitfalls of using Schema D (composite key) instead of Schema C, and whether it's OK to have only some fields of a foreign key (attribute_value_id, attribute_id) in the PK (movie_id, attribute_id).
As Marc_s comments, EAV designs have a whole bunch of drawbacks. In the case of a movie collection, you know the schema, and it's unlikely to change randomly, and when it does change (e.g. you need to add a flag "available in 4K"), it's probably a big deal.
Ask yourself how you will retrieve all films for a given genre, or all films available in both the US and Japan, or all comedies available in the US but not Japan - you'll very quickly see the limits of EAV.
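To make that concrete, here is a rough sketch of the last query ("comedies available in the US but not Japan") against a bare EAV table, using Python's built-in sqlite3 module. The table and column names, the two "Sample Comedy" titles and all the attribute rows are made up purely for illustration; they are not from the question. Note that every attribute you filter on costs another join or subquery against the same table.

```python
import sqlite3

# In-memory database; all names and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE movie_attributes (
        movie_id  INTEGER NOT NULL REFERENCES movies(id),
        attribute TEXT NOT NULL,
        value     TEXT NOT NULL
    );
    INSERT INTO movies (id, title) VALUES
        (1, 'The Godfather'), (2, 'Matrix'),
        (3, 'Sample Comedy A'), (4, 'Sample Comedy B');
    INSERT INTO movie_attributes (movie_id, attribute, value) VALUES
        (1, 'Country', 'USA'), (1, 'Type', 'Thriller'),
        (2, 'Country', 'USA'), (2, 'Country', 'Japan'), (2, 'Type', 'Thriller'),
        (3, 'Country', 'USA'), (3, 'Type', 'Comedy'),
        (4, 'Country', 'USA'), (4, 'Country', 'Japan'), (4, 'Type', 'Comedy');
""")

# One join per attribute in the filter, plus an anti-join for "not Japan".
us_only_comedies = [row[0] for row in conn.execute("""
    SELECT m.title
    FROM movies AS m
    JOIN movie_attributes AS t
      ON t.movie_id = m.id AND t.attribute = 'Type' AND t.value = 'Comedy'
    JOIN movie_attributes AS us
      ON us.movie_id = m.id AND us.attribute = 'Country' AND us.value = 'USA'
    WHERE NOT EXISTS (
        SELECT 1 FROM movie_attributes AS jp
        WHERE jp.movie_id = m.id
          AND jp.attribute = 'Country' AND jp.value = 'Japan'
    )
    ORDER BY m.title
""")]
print(us_only_comedies)
```

With ordinary columns (say, a genre column and an availability table) the same question is a plain WHERE clause.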
To answer your question - none of your designs work for me - there are too many tables that don't earn their keep. If you really must go EAV, I'd suggest:
MOVIES
---------
MovieID
.....
ATTRIBUTES
--------------
AttributeID
AttributeName
MOVIE_ATTRIBUTES
------------
MovieID
AttributeID
Value
If you want to provide the list of valid values, the easiest way is to query the "movie attributes" table and retrieve previous entries for that combination of movie and attribute - keeping your schema simple will make life MUCH easier.
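A minimal sketch of those three tables, again in Python with sqlite3. The column types, the composite primary key on (movie_id, attribute_id), the sample rows and the helper name suggested_values are my assumptions, not part of the layout above. The primary key is what gives you "one value per attribute per movie" without any application code, and I've read "previous entries" as the distinct values already stored for an attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE movies (
        movie_id INTEGER PRIMARY KEY,
        title    TEXT NOT NULL
    );
    CREATE TABLE attributes (
        attribute_id   INTEGER PRIMARY KEY,
        attribute_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE movie_attributes (
        movie_id     INTEGER NOT NULL REFERENCES movies(movie_id),
        attribute_id INTEGER NOT NULL REFERENCES attributes(attribute_id),
        value        TEXT NOT NULL,
        PRIMARY KEY (movie_id, attribute_id)  -- assumed: one value per attribute
    );
    INSERT INTO movies VALUES (1, 'The Godfather'), (2, 'Matrix');
    INSERT INTO attributes VALUES (1, 'Country'), (2, 'Type');
    -- Sample rows for illustration only.
    INSERT INTO movie_attributes VALUES
        (1, 1, 'USA'), (1, 2, 'Crime'),
        (2, 1, 'USA'), (2, 2, 'Thriller');
""")

def suggested_values(attribute_name):
    """Hypothetical helper: distinct values already entered for an attribute."""
    rows = conn.execute("""
        SELECT DISTINCT ma.value
        FROM movie_attributes AS ma
        JOIN attributes AS a ON a.attribute_id = ma.attribute_id
        WHERE a.attribute_name = ?
        ORDER BY ma.value
    """, (attribute_name,))
    return [row[0] for row in rows]

# A second Country for The Godfather violates the primary key.
try:
    conn.execute("INSERT INTO movie_attributes VALUES (1, 1, 'Japan')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(suggested_values("Type"), duplicate_rejected)
```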
If you really want to put the values in a separate table, schema D appears correct.
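For what it's worth, here is how I would sketch schema D, going only by the description in the question (the diagrams are not reproduced here, so the exact table and column names are my guess). The link table's primary key is (movie_id, attribute_id), and its foreign key is the composite (attribute_id, attribute_value_id) pointing at attribute_value, which in turn points at attribute. Having only part of the foreign key inside the primary key is fine: the two constraints do different jobs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE attribute_value (
        attribute_id INTEGER NOT NULL REFERENCES attribute(id),
        id           INTEGER NOT NULL,
        value        TEXT NOT NULL,
        PRIMARY KEY (attribute_id, id)          -- weak entity: composite key
    );
    CREATE TABLE movie_attribute_value (
        movie_id           INTEGER NOT NULL REFERENCES movie(id),
        attribute_id       INTEGER NOT NULL,
        attribute_value_id INTEGER NOT NULL,
        PRIMARY KEY (movie_id, attribute_id),   -- one value per attribute
        FOREIGN KEY (attribute_id, attribute_value_id)
            REFERENCES attribute_value (attribute_id, id)
    );
    INSERT INTO movie VALUES (1, 'The Godfather'), (2, 'Matrix');
    INSERT INTO attribute VALUES (1, 'Country'), (2, 'Type');
    INSERT INTO attribute_value VALUES
        (1, 1, 'USA'), (1, 2, 'Japan'), (2, 1, 'Thriller'), (2, 2, 'Comedy');
""")

def try_link(movie_id, attribute_id, value_id):
    """Hypothetical helper: True if the row is accepted, False if rejected."""
    try:
        conn.execute(
            "INSERT INTO movie_attribute_value VALUES (?, ?, ?)",
            (movie_id, attribute_id, value_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first_country = try_link(1, 1, 1)    # Godfather, Country, USA
second_country = try_link(1, 1, 2)   # Godfather, Country, Japan: PK violation
unknown_value = try_link(1, 2, 3)    # Type has no value 3: FK violation
print(first_country, second_country, unknown_value)
```

The composite foreign key also guarantees that the chosen value really belongs to the chosen attribute, which is the main thing D buys you over a plain surrogate key on attribute_value.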
Schema C says:
I believe the last statement is incorrect.
One approach would be to lump all the attributes in the one table, along with a defined type of attribute. Thus:
Movies
------
MovieId
AttributeTypes
---------------
AttributeTypeId
Description
Attributes
---------
AttributeId
AttributeTypeId
Description
MovieAttributes
---------------
MovieId
AttributeId
It could make for awkward queries, but that really depends on how the stored data will be used.
(In other words, yes, I agree with prior posts, and recommend avoiding EAV structures.)