
mysql - Creating rows vs. columns performance

I built an analytics engine that pulls 50-100 rows of raw data from my database (let's call it raw_table), runs a bunch of statistical measurements on it in PHP, and then comes up with exactly 140 datapoints that I need to store in another table (let's call it results_table). All of these data points are very small numbers ("40", "2.23", "-1024" are good examples of the types of data).

I know the maximum number of columns for MySQL is quite high (4000+), but there appears to be a lot of grey area as to when performance really starts to degrade.

So a few questions here on best performance practices:

1) If fewer columns is better, the 140 datapoints could be broken up into 20 rows of 7 data points, all with the same 'experiment_id'. HOWEVER, I would always need to pull ALL 20 rows (with 7 columns each, plus id, etc.), so I wouldn't expect this to perform better than pulling 1 row of 140 columns. So the question: is it better to store 20 rows of 7-9 columns (that would all need to be pulled at once) or 1 row of 140-143 columns?

2) Given my data examples ("40","2.23","-1024" are good examples of what will be stored) I'm thinking smallint for the structure type. Any feedback there, performance-wise or otherwise?

3) Any other feedback on mysql performance issues or tips is welcome.

Thanks in advance for your input.

themerlinproject asked Feb 14 '11 04:02



2 Answers

I think the advantage to storing as more rows (i.e. normalized) depends on design and maintenance considerations in the face of change.

It also depends on whether the 140 columns have the same meaning for every experiment or differ per experiment. Either way, model the data properly according to normalization rules, i.e. by how each value relates to a candidate key.

As far as performance goes, if all the columns are used it makes very little difference. A pivot/unpivot operation can sometimes be expensive over a large amount of data, but it makes little difference with a single-key access pattern. Sometimes a pivot in the database can make your frontend code a lot simpler and your backend code more flexible in the face of change.
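To make the single-key comparison concrete, here is a minimal sketch of the two layouts. It is written in Python with the standard-library sqlite3 module standing in for MySQL and PHP, purely so it can run anywhere. The table names results_wide and results_tall and the columns p0..p139, point_idx and value are hypothetical and not from the question.

```python
import sqlite3

N_POINTS = 140  # datapoints per experiment, as in the question

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")

# Wide layout: one row per experiment, one column per datapoint (p0..p139).
wide_cols = ", ".join(f"p{i} REAL" for i in range(N_POINTS))
conn.execute(
    f"CREATE TABLE results_wide (experiment_id INTEGER PRIMARY KEY, {wide_cols})"
)

# Tall (normalized) layout: one row per datapoint.
conn.execute(
    "CREATE TABLE results_tall ("
    " experiment_id INTEGER NOT NULL,"
    " point_idx INTEGER NOT NULL,"
    " value REAL,"
    " PRIMARY KEY (experiment_id, point_idx))"
)


def store(experiment_id, values):
    """Write the same 140 values into both layouts."""
    placeholders = ", ".join("?" for _ in range(N_POINTS + 1))
    conn.execute(
        f"INSERT INTO results_wide VALUES ({placeholders})",
        [experiment_id, *values],
    )
    conn.executemany(
        "INSERT INTO results_tall VALUES (?, ?, ?)",
        [(experiment_id, i, v) for i, v in enumerate(values)],
    )


def load_wide(experiment_id):
    """Single-key read of one wide row; drop the id column."""
    row = conn.execute(
        "SELECT * FROM results_wide WHERE experiment_id = ?", (experiment_id,)
    ).fetchone()
    return list(row[1:])


def load_tall(experiment_id):
    """Single-key read of 140 tall rows, pivoted back into one list."""
    rows = conn.execute(
        "SELECT value FROM results_tall WHERE experiment_id = ? ORDER BY point_idx",
        (experiment_id,),
    )
    return [r[0] for r in rows]


sample = [40, 2.23, -1024] + [float(i) for i in range(N_POINTS - 3)]
store(1, sample)
print(load_wide(1) == load_tall(1))
```

Both loaders return the same list for a given experiment_id. The only difference the caller sees is the pivot step in load_tall. This sketch does not measure speed, so any performance claim would still need to be checked against the real MySQL tables.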

If you have a lot of NULLs, it might be possible to eliminate rows in a normalized design and this would save space. I don't know if MySQL has support for a sparse table concept, which could come into play there.
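A rough sketch of the space-saving idea, assuming a normalized layout in which a missing datapoint is simply not stored. It uses Python with the standard-library sqlite3 module as a stand-in for MySQL. The names results_tall, point_idx, store_sparse and load_sparse are hypothetical.

```python
import sqlite3

N_POINTS = 140  # datapoints per experiment, as in the question

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results_tall ("
    " experiment_id INTEGER NOT NULL,"
    " point_idx INTEGER NOT NULL,"
    " value REAL NOT NULL,"
    " PRIMARY KEY (experiment_id, point_idx))"
)


def store_sparse(experiment_id, values):
    """Insert only the datapoints that are present; return how many rows were written."""
    rows = [(experiment_id, i, v) for i, v in enumerate(values) if v is not None]
    conn.executemany("INSERT INTO results_tall VALUES (?, ?, ?)", rows)
    return len(rows)


def load_sparse(experiment_id):
    """Rebuild the full 140-slot list, leaving None where no row exists."""
    values = [None] * N_POINTS
    query = "SELECT point_idx, value FROM results_tall WHERE experiment_id = ?"
    for idx, value in conn.execute(query, (experiment_id,)):
        values[idx] = value
    return values


sparse = [None] * N_POINTS
sparse[0], sparse[7], sparse[139] = 40.0, 2.23, -1024.0
stored = store_sparse(1, sparse)
print(stored, load_sparse(1) == sparse)
```

Here only three rows are written for an experiment with three non-NULL datapoints, instead of one row carrying 137 NULL columns. Whether that saves space in practice depends on the storage engine and on how sparse the real data is.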

Cade Roux answered Sep 20 '22 05:09



You have 140 data items to return every time, each of type double.

It makes no practical difference whether this is 1x140 or 20x7 or 7x20 or 4x35, etc. It could be infinitesimally quicker for one shape, of course, but have you considered the extra complexity in the PHP code needed to deal with a different shape?
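The extra complexity mentioned here is essentially a reshape step in the application. A minimal sketch follows, in Python rather than the asker's PHP so it stays short and runnable. The helper names to_rows and to_flat are hypothetical.

```python
def to_rows(values, width):
    """Split a flat list into rows of `width` items (e.g. 140 values -> 20 rows of 7)."""
    if len(values) % width != 0:
        raise ValueError("number of values must be a multiple of the row width")
    return [values[i:i + width] for i in range(0, len(values), width)]


def to_flat(rows):
    """Flatten rows back into a single list, preserving order."""
    return [v for row in rows for v in row]


flat = list(range(140))
rows = to_rows(flat, 7)
print(len(rows), len(rows[0]), to_flat(rows) == flat)
```

Every shape other than the one the statistics code naturally produces needs a pair of helpers like this on the write and read paths, which is the maintenance cost to weigh against any tiny speed difference.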

Do you have a verified bottleneck, or is this just random premature optimisation?

gbn answered Sep 20 '22 05:09
