I have a list and that list increasing continuously. I am doing add batch depend on the list size. I forgot to put limit for do executeBatch in specified size.
Program is working for hours. I dont want to stop, fix and start again for now.
My questions, what decides size of the adding batch? What is the max capacity of the batch to do executeBatch()
in a one time? How many time I can use addBatch
without do executeBatch()
?
The batch size can be anything but needs to be decided carefully. JDBC specification supports up to 100 but individual databases e.g. Oracle, MySQL, Sybase, or SQL Server has their own limit on maximum batch size,, normal jdbc batch size ranges from 50 to 100.
The batch size range is 10,000 to 100,000 for 1 to 1, 1 to Many, and Many to 1 rules. The default value is 30,000.
Batch Processing allows you to group related SQL statements into a batch and submit them with one call to the database. When you send several SQL statements to the database at once, you reduce the amount of communication overhead, thereby improving performance. JDBC drivers are not required to support this feature.
A JDBC batch update is a batch of updates grouped together, and sent to the database in one batch, rather than sending the updates one by one. Sending a batch of updates to the database in one go, is faster than sending them one by one, waiting for each one to finish.
PgJDBC has some limitations regarding batches:
All request values, and all results, must be accumulated in memory. This includes large blob/clob results. So free memory is the main limiting factor for batch size.
Until PgJDBC 9.4 (not yet released), batches that return generated keys always do a round trip for every entry, so they're no better than individual statement executions.
Even in 9.4, batches that return generated keys only offer a benefit if the generated values are size limited. A single text
, bytea
or unconstrained varchar
field in the requested result will force the driver to do a round trip for every execution.
The benefit of batching is a reduction in network round trips. So there's much less point if your DB is local to your app server. There's a diminishing return with increasing batch size, because the total time taken in network waits falls off quickly, so it's often not work stressing about trying to make batches as big as possible.
If you're bulk-loading data, seriously consider using the COPY
API instead, via PgJDBC's CopyManager
, obtained via the PgConnection
interface. It lets you stream CSV-like data to the server for rapid bulk-loading with very few client/server round trips. Unfortunately, it's remarkably under-documented - it doesn't appear in the main PgJDBC docs at all, only in the API docs.
AFAIK there is no limit beside the memory issue. regarding your question: the statement is sent to the DB only on execute batch so until you'll execute the batch the memory will continue to grow until you will get JavaHeapSpace or the batch will be sent to the DB.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With