 

PostgreSQL slow on a large table with arrays and lots of updates

I have a pretty large table (20M records) which has a three-column index and an array column. The array column is updated daily (by appending new values) for all rows. There are also inserts, but not as many as there are updates.

The data in the array represents daily measurements corresponding to the three keys, something like this: [[date_id_1, my_value_for_date_1], [date_id_2, my_value_for_date_2]]. It is used to draw a graph of those daily values. Say I want to visualize the values for the key (a, b, c) over time: I do SELECT values FROM t WHERE a = my_a AND b = my_b AND c = my_c, then use the values array to draw the graph.
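
To make that concrete, here is a minimal sketch of the layout I am describing (all names are placeholders; note the column has to be written as "values", since VALUES is a reserved word):

-- Sketch of the table described above (names are placeholders).
CREATE TABLE t (
    a integer NOT NULL,
    b integer NOT NULL,
    c integer NOT NULL,
    "values" integer[][],    -- [[date_id, value], [date_id, value], ...]
    PRIMARY KEY (a, b, c)
);

-- The daily bulk update appends one [date_id, value] pair to each row
-- (in reality each row gets its own measured value; a constant is shown for brevity):
UPDATE t SET "values" = "values" || ARRAY[[20100623, 42]];

-- Reading the series for one key, to draw the graph:
SELECT "values" FROM t WHERE a = 1 AND b = 2 AND c = 3;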

Performance of the updates (which happen in a bulk once a day) has worsened considerably over time.

Using PostgreSQL 8.3.8.

Can you give me any hints on where to look for a solution? It could be anything from tweaking some parameters in postgres to even moving to another database (I guess a non-relational database would be better suited for this particular table, but I don't have much experience with those).

asked Jun 23 '10 by ibz


1 Answer

I would take a look at the FILLFACTOR for the table. By default it's set to 100; you could lower it to 70 (to start with). After this, you have to do a VACUUM FULL to rebuild the table:

ALTER TABLE tablename SET (FILLFACTOR = 70);  -- leave 30% of each page free for updated row versions
VACUUM FULL tablename;                        -- rewrite the table so the new fill factor takes effect
REINDEX TABLE tablename;                      -- rebuild the indexes, which VACUUM FULL bloats on 8.3

This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. (Or, if your database is already somewhat fragmented from lots of previous updates, it might already be sparse enough.) Your table then also has the option of HOT (heap-only tuple) updates, available since 8.3, assuming the column you are updating is not one involved in any index.
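
To see whether your updates actually end up as HOT updates, you could compare the counters in pg_stat_user_tables (the n_tup_hot_upd column exists as of 8.3; tablename is the same placeholder as above):

-- Compare total updates with heap-only (HOT) updates for the table:
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'tablename';

If n_tup_hot_upd stays close to n_tup_upd after lowering the fill factor, the updated rows are staying on the same pages and the indexes don't have to be touched.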

answered Oct 14 '22 by Frank Heikens