I'm working on a project that uses psycopg2 to connect to a PostgreSQL database from Python. In one place, all modifications to the database are committed after a certain number of insertions.
So I have a basic question: is there any benefit to committing in smaller chunks, or is it the same as waiting to commit until the end?
For example, if I'm going to insert 7,000 rows of data, should I commit after every 1,000 rows, or just wait until all the data is added?
If there are problems with large commits, what are they? Could I crash the database, or cause some sort of overflow?
Unlike some other database systems, PostgreSQL has no problem with modifying arbitrarily many rows in a single transaction.
Usually it is better to do everything in a single transaction, so that the whole operation succeeds or fails as a unit, but there are two considerations:
If the transaction takes a lot of time, it will keep VACUUM from doing its job on the rest of the database for the duration of the transaction. That may cause table bloat if there is a lot of concurrent write activity.
It may be useful to do the operation in batches if you expect many failures and don't want to restart from the beginning every time.
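The batching approach from the second point can be sketched as follows. This is a minimal illustration using the standard-library sqlite3 module as a stand-in for psycopg2 (so it runs without a server); the table name `items` and batch size are made up for the example, and with psycopg2 the only substantive changes would be the connection call and `%s` placeholders instead of `?`:

```python
import sqlite3

def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows, committing after every batch_size insertions.

    A failure in one batch only rolls back that batch, so a retry
    can resume from the last committed batch instead of the start.
    """
    cur = conn.cursor()
    for i in range(0, len(rows), batch_size):
        batch = rows[i:i + batch_size]
        cur.executemany("INSERT INTO items (value) VALUES (?)", batch)
        conn.commit()  # ends the transaction for this batch

# Demo with an in-memory database and 7,000 rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value INTEGER)")
rows = [(n,) for n in range(7000)]
insert_in_batches(conn, rows, batch_size=1000)
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)  # 7000
```

The trade-off is exactly the one described above: each `commit()` makes that batch durable and releases the transaction, at the cost of losing all-or-nothing semantics for the full 7,000 rows.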