 

Database design - should I use 30 columns or 1 column with all data in form of JSON/XML?

I am working on a project that needs to store 30 distinct fields for a piece of business logic, which will later be used to generate reports.

The 30 distinct fields are not all written at one time. The business logic involves many transactions, so the writes look like:

Transaction 1, update field 1-4
Transaction 2, update field 3,5,9
Transaction 3, update field 8,12, 20-30
...
...

N.B. Each transaction (all belonging to one piece of business logic) updates an arbitrary number of fields, and not in any particular order.

I am wondering which database design would be best:

  1. Have 30 columns in the Postgres database, one representing each of those 30 distinct fields.

  2. Store all 30 fields as XML or JSON in a single column of the Postgres table.

Which of 1 or 2 is better?
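As a sketch of what the two options look like in Postgres (table and column names are my own assumptions, not from the requirements):

```sql
-- Option 1: one column per field (wide table)
CREATE TABLE business_record (
    id      serial PRIMARY KEY,
    field1  varchar(10),
    field2  varchar(20),
    -- ... one column per remaining field ...
    field30 varchar(50)
);

-- Option 2: a single document column (alternative to the table above)
CREATE TABLE business_record_doc (
    id   serial PRIMARY KEY,
    data xml   -- or json
);
```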

If I choose 1:

I know it is easier from a programming perspective: I don't need to read the whole XML/JSON document, update a few fields, and write it back to the database; I can just update the few columns each transaction needs.
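With option 1, each transaction becomes a plain partial UPDATE that only touches the columns it changes (a sketch; the table name, column names, and values are assumed for illustration):

```sql
-- Transaction 2 from the example above: update fields 3, 5 and 9 only
UPDATE business_record
SET field3 = 'abc',
    field5 = 'def',
    field9 = 'ghi'
WHERE id = 42;
```

The remaining 27 columns stay untouched, so there is no read-modify-write cycle on the whole document.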

If I choose 2:

I could potentially reuse the table generically for something else, since what's inside the blob column is just XML. But is it wrong to use a generic table to store data that is totally unrelated in business logic, just because it has a blob column holding XML? It would save the effort of creating a few new tables, but is this kind of generic table reuse wrong in an RDBMS?

Also, by choosing 2 it seems I could handle potential changes, like altering certain fields or adding more fields; at least it seems I wouldn't need to change the database table. But I would still need to change the C++ and C# code to handle the change internally, so I'm not sure whether this is really an advantage.

I am not experienced enough in database design to make this decision. Any input is appreciated.

N.B. There is a good chance I won't need to index or search on those 30 columns for now; a primary key would be created on an extra column if I choose 2. But I am not sure whether I will later be required to search on any of those columns/fields.

Basically all my fields are predefined in the requirement documents; they generally look like simple fields:

field1: value(max len 10)
field2: value(max len 20)
...
field20: value(max len 2)

No nested fields. Is it worth creating 20 columns, one for each of those fields (some are strings holding dates/times, some are plain strings, some are integers, etc.)?
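Since the fields and their maximum lengths are fixed by the requirement documents, option 1 lets each column carry a real type and length constraint. A sketch, with the types and names assumed rather than taken from the actual requirements:

```sql
CREATE TABLE business_record (
    id          serial PRIMARY KEY,
    field1      varchar(10),    -- "value(max len 10)"
    field2      varchar(20),    -- "value(max len 20)"
    field3      integer,        -- a field known to be numeric
    recorded_at timestamp,      -- a field known to be a date/time
    -- ... remaining fields ...
    field20     varchar(2)      -- "value(max len 2)"
);
```

The database then enforces lengths and types on every write, which a single blob column cannot do.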

2. Is putting different business logic in a shared table a bad design idea, if it is only put there because the rows share the same structure? E.g. they all have a date/time column, a primary key, and an XML column holding different business logic. This would save some effort of creating new tables; is that saving worth it?

RoundPi asked Jan 16 '23


2 Answers

Always store your XML/JSON fields as separate columns in a relational database. Doing so keeps your database normalized, lets the database do its thing with queries, indices, etc., and saves other developers the headache of deciphering your XML/JSON field.

It will be more work up front to extract the fields from the XML/JSON, and perhaps to maintain them if fields need to be added, but once you create a class or classes to do so, that hurdle is eliminated and it will more than make up for a cryptic blob field.

Norma L. Ized answered Jan 30 '23


In general it's wise to split the JSON or XML document out and store it as individual columns. This gives you the ability to set up constraints on the columns for validation and checking, to index columns, to use appropriate data types for each field, and generally use the power of the database.
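For example, with individual columns you can attach checks and indexes that a single document column cannot express directly (constraint and index names here are illustrative):

```sql
-- Validate a numeric field at write time
ALTER TABLE business_record
    ADD CONSTRAINT field3_positive CHECK (field3 > 0);

-- Make later searches on a field cheap
CREATE INDEX business_record_field1_idx
    ON business_record (field1);
```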

Mapping it to/from objects isn't generally too hard, as there are numerous tools for this. For example, Java offers JAXB and JPA.

The main time when splitting it out isn't such a great idea is when you don't know in advance what the fields of the JSON or XML document will be or how many of them there will be. In this case you really only have two choices - to use an EAV-like data model, or store the document directly as a database field.

In this case (and this case only) I would consider storing the document in the database directly. PostgreSQL's SQL/XML support means you can still create expression indexes on xpath expressions, and you can use triggers for some validation.
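An expression index over an xpath lookup might look like the following sketch (the `/record/field1` path is an assumed document layout, not one from the question):

```sql
-- xpath() returns xml[], so take the first element and cast it to text
-- to get a value that a btree index can compare.
CREATE INDEX business_record_doc_field1_idx
    ON business_record_doc (((xpath('/record/field1/text()', data))[1]::text));
```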

This isn't a good option, it's just that EAV is usually an even worse option.

If the document is "flat" - i.e. a single level of keys and values, with no nesting - then consider storing it as hstore instead, as the hstore data type is a lot more powerful for this kind of data.
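A flat document stored as hstore can be partially updated and queried per key, which fits the transaction pattern in the question. A sketch, assuming the hstore extension is available and using made-up names:

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE business_record_kv (
    id     serial PRIMARY KEY,
    fields hstore
);

-- Partial update: merge in just the keys this transaction changes
UPDATE business_record_kv
SET fields = fields || 'field3 => abc, field5 => def'::hstore
WHERE id = 42;

-- Read a single field back
SELECT fields -> 'field1' FROM business_record_kv WHERE id = 42;

-- A GIN index makes key/value lookups fast
CREATE INDEX business_record_kv_fields_idx
    ON business_record_kv USING gin (fields);
```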

Craig Ringer answered Jan 30 '23