Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it ever a good idea to store an array as a field value, or store array values as records?

In my application I've got "articles" (similar to posts/tweets/articles) that are tagged with descriptive predefined tags: i.e "difficult", "easy", "red", "blue", "business" etc

These available tags are stored in a table, call it "tags" that contains all available tags.

Each article can be tagged with multiple tags, editable through a custom admin interface.

It could be tempting to simply bundle the tags for each entity into a stringified array of the IDs of each tag and store it alongside the article record in my "articles" table:

id | title | author | tags
---+-------+--------+-------------
1  | title | TG     | "[1,4,7,12]"

though I'm sure this is a bad idea for a number of reasons, is there ever a reasonable reason to do the above?

like image 631
OliverJ90 Avatar asked Jan 25 '16 14:01

OliverJ90


1 Answers

I think you should read about Database normalization and decide for yourself. In short though, there are a number of issues with your proposal, but you may decide you can live with them.

The most obvious are:

  1. What if an additional tag is added to row(1)? Do you have to first parse, check if it's already present then update the row to be tags.append(newTag).
  2. Worse still deleting a tag? Search tags, is present, re-create tags.
  3. What if a tag is to change name - some moderation process, perhaps?
  4. Worse again, what about dfferent people specifying a tag-name differently - it'd be hard to rationalise.
  5. What if you want to query data based on tags? Your query becomes far more complex than it would need to be.
  6. Presentation: The client has to parse the tag in order to use it. What about the separator field? Change that and all clients have to change.

In short, all of these operations become harder and more cumbersome. Normalization is designed to overcome such issues. Probably the only reason for doing what you say, IMO, is that you're capturing the data as a one-off and it's informational only - that is, makes sense to a user but not to a system per-se. This is kind of like saying it's probably best avoided (again, IMO).

like image 170
wmorrison365 Avatar answered Sep 20 '22 15:09

wmorrison365