I have a table with a primary id column (automatically indexed), two sub-id columns (also indexed), and 12 columns of type geometry(Polygon). If I am going to update ~2 million rows on this table, is it faster for me to run ~2 million individual update statements like
update TABLE_NAME set ( COLUMNS ) = ( VALUES ) where ID_COLUMN = NEXT_ID
or is it faster to do some smaller number of larger update statements like in this answer
update TABLE_NAME as update_t set
COLUMNS = new_vals.COLUMNS
from (values
(id, polygon1val, polygon2val, ... polygon12val), /* row 1 */
(id, polygon1val, polygon2val, ... polygon12val), /* row 2 */
... /* ... */
(id, polygon1val, polygon2val, ... polygon12val) /* row N */
) as new_vals( COLUMNS )
where new_vals.id = update_t.id
If the latter, do you have any suggestions on what a good N might be? Is N = ~2mil, or some smaller subset (that I would repeat until they're all done)?
EDIT: Obviously, in the former case I would use a prepared statement. But I also wonder, in the latter case, is there any benefit in trying to use a prepared statement?
I'm using PostgreSQL 9.2.
3 Answers
The single UPDATE is faster. That is, multiple UPDATE statements turned out to be 5-6 times slower than a single UPDATE.
One of my favorite ways of dealing with millions of records in a table is processing inserts, deletes, or updates in batches. Updating data in batches of 10,000 records at a time and using a transaction is a simple and efficient way of performing updates on millions of records.
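As a rough sketch of that batching idea, the Python below splits the rows into chunks and builds one UPDATE ... FROM (VALUES ...) statement per chunk, in the same shape as the statement in the question. The helper names (chunked, build_batch_update, batch_statements) are made up for illustration, and the %s placeholder style is an assumption borrowed from DB-API drivers such as psycopg2; nothing here connects to a database.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_batch_update(table, id_col, cols, n_rows):
    """Return one UPDATE ... FROM (VALUES ...) statement with %s placeholders.

    Identifiers are interpolated as-is, so they must come from trusted code,
    never from user input.
    """
    all_cols = [id_col] + list(cols)
    # One "(%s, %s, ...)" group per row, one placeholder per column.
    row_tpl = "(" + ", ".join(["%s"] * len(all_cols)) + ")"
    values = ", ".join([row_tpl] * n_rows)
    assignments = ", ".join(f"{c} = new_vals.{c}" for c in cols)
    return (
        f"UPDATE {table} AS update_t SET {assignments} "
        f"FROM (VALUES {values}) AS new_vals({', '.join(all_cols)}) "
        f"WHERE new_vals.{id_col} = update_t.{id_col}"
    )


def batch_statements(table, id_col, cols, rows, batch_size=10_000):
    """Yield (sql, flat_params) pairs, one per batch of rows."""
    for batch in chunked(rows, batch_size):
        sql = build_batch_update(table, id_col, cols, len(batch))
        # Flatten the rows so the parameters line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params
```

Each (sql, params) pair could then be handed to a cursor's execute call inside a transaction, committing once per batch. Two caveats, both worth checking against your own setup: values in a VALUES list may need explicit casts (for example to the geometry type), because PostgreSQL cannot always infer column types from bare parameters, and drivers that bind parameters on the server side have a per-statement parameter limit, so a smaller batch size may be needed there.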
In general, the better you can batch operations into sets, the more options the database has for making things fast. If you run the updates individually, the only option is something like "locate the one row affected, delete it, insert a new one".
If you can batch the updates then the planner gets to decide whether a sequential scan may be faster than a bunch of index scans (and it may well be, since you get to leverage read-ahead caching). In other words, one command updating a lot of rows almost always performs better than a lot of commands updating a single row, even aside from planning overhead.