Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is multi-value field a bad idea in relational databases

Having been working with Mongodb and Solr/Lucene, I am starting to wonder why multi-value field for relational databases are (generally) considered an bad idea?

I am aware of the theoretical foundation of relational database and normalization. In practice, however, I ran into many use cases where I end up using an meta table of key-value pairs to supplement the main table, such as in the cases of tagging, where I wish I don't have to make multiple joins to look up the data. Or where requirements suddenly changed from having to support an single author to multiple authors per article.

So, what are some disadvantages of having multi-value fields or did the vendor choose not to support it since it not part of the SQL standard?

like image 818
ltfishie Avatar asked Nov 26 '11 23:11

ltfishie


People also ask

What are multi-valued attributes What are the problems caused by them?

Multivalued attributes can cause problems with the meaning of data in the database, significantly slow down searching, and place unnecessary restrictions on the amount of data that can be stored. Assume, for example, that you have an Employee entity, with attributes for the name and birthdates of dependents.

What are the main problems with relational databases?

The relational data model doesn't fit in with every domain. Difficult schema evolution due to an inflexible data model. Weak distributed availability due to poor horizontal scalability. Performance hit due to joins, ACID transactions and strict consistency constraints (especially in distributed environments).

What is a multivalued field in a database?

A multivalued field (MVF) allows for the storage of more than one value in a database field. MVFs are somewhat controversial, with many arguing that they violate one of the very sacred tenets of database design as laid out by E.F.

Why are relational databases bad for big data?

RDBMS lacks in high velocity because it's designed for steady data retention rather than rapid growth. Even if RDBMS is used to handle and store “big data,” it will turn out to be very expensive. As a result, the inability of relational databases to handle “big data” led to the emergence of new technologies.


2 Answers

The main disadvantage is query bias. The phenomenon that such databases tend to get designed with one particular kind of query in mind, and turn out to be difficult to handle when other queries need to be written.

Suppose you have Students and Courses, and you model all of that so that you can say, in a single row in a single table, "John Doe takes {French, Algebra, Relational Theory}" and "Jane Doe takes {German, Functional Computing, Relational Theory}".

That makes it easy to query "what are all the courses followed by ...", but try and imagine what it would take to produce the answer to "what are all the students who follow Relational Theory".

Try and imagine all the things the system should itself be doing to give such a query (if it were possible to write it) any chance of performing reasonably ...

like image 167
Erwin Smout Avatar answered Sep 29 '22 06:09

Erwin Smout


The query bias is assuming that SQL is a always a good query language. The fact is it is sometimes an excellent query language, but it has never been one size fits all. Multivalue databases allow you to pack multiple values and handle 'alternate perspective' queries. Examples of MVDBs: UniData http://u2.rocketsoftware.com/products/u2-unidata, OpenInsight http://www.revelation.com/, Reality http://www.northgate-is.com/. There are many others. Their query languages support what you are looking to do.

like image 30
Charles Barouch Avatar answered Sep 29 '22 07:09

Charles Barouch