 

What is the max JDBC batch size?

I have a list that grows continuously, and I call addBatch for each element as it arrives. I forgot to put a limit on the batch size before calling executeBatch.

The program has been running for hours. I don't want to stop it, fix the code, and start it again right now.

My questions: what determines the size of a batch? What is the maximum number of statements that executeBatch() can send in one call? How many times can I call addBatch() without calling executeBatch()?
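For context, a size-capped batch loop looks roughly like this. This is a minimal sketch: the actual JDBC calls (`addBatch`, `executeBatch`) are shown as comments because they need a live connection, and the counters below only demonstrate the flushing arithmetic; the cap of 500 is an illustrative value, not a driver-mandated limit.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchFlushDemo {
    // An illustrative cap; tune for your driver, DB, and row size.
    static final int BATCH_SIZE = 500;

    public static void main(String[] args) {
        // Stand-in for the continuously growing list from the question.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) rows.add(i);

        int pending = 0;  // rows added since the last flush
        int flushes = 0;  // number of executeBatch() calls

        for (Integer row : rows) {
            // stmt.setInt(1, row);
            // stmt.addBatch();
            pending++;
            if (pending == BATCH_SIZE) {
                // stmt.executeBatch();  // send the accumulated rows now
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            // stmt.executeBatch();  // flush the remainder
            flushes++;
        }
        System.out.println("executeBatch calls: " + flushes);
    }
}
```

With 2500 rows and a cap of 500, the loop flushes five times instead of holding everything in memory until the end.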

asked Jan 02 '14 by vlyalcin

People also ask

What is batch size in JDBC?

The batch size can be anything but needs to be chosen carefully. The JDBC specification supports up to 100, but individual databases (e.g. Oracle, MySQL, Sybase, or SQL Server) have their own limits on maximum batch size; a typical JDBC batch size ranges from 50 to 100.

What is Oracle batch size?

The batch size range is 10,000 to 100,000 for 1 to 1, 1 to Many, and Many to 1 rules. The default value is 30,000.

What is batching in JDBC?

Batch Processing allows you to group related SQL statements into a batch and submit them with one call to the database. When you send several SQL statements to the database at once, you reduce the amount of communication overhead, thereby improving performance. JDBC drivers are not required to support this feature.

What is JDBC batch update?

A JDBC batch update is a group of updates sent to the database in one batch, rather than one by one. Sending a batch of updates to the database in one go is faster than sending them individually and waiting for each one to finish.


2 Answers

PgJDBC has some limitations regarding batches:

  • All request values, and all results, must be accumulated in memory. This includes large blob/clob results. So free memory is the main limiting factor for batch size.

  • Until PgJDBC 9.4 (not yet released), batches that return generated keys always do a round trip for every entry, so they're no better than individual statement executions.

  • Even in 9.4, batches that return generated keys only offer a benefit if the generated values are size limited. A single text, bytea or unconstrained varchar field in the requested result will force the driver to do a round trip for every execution.

The benefit of batching is a reduction in network round trips. So there's much less point if your DB is local to your app server. There's a diminishing return with increasing batch size, because the total time spent in network waits falls off quickly, so it's often not worth stressing about trying to make batches as big as possible.

If you're bulk-loading data, seriously consider using the COPY API instead, via PgJDBC's CopyManager, obtained via the PgConnection interface. It lets you stream CSV-like data to the server for rapid bulk-loading with very few client/server round trips. Unfortunately, it's remarkably under-documented - it doesn't appear in the main PgJDBC docs at all, only in the API docs.
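A rough sketch of that COPY path follows. The `copyIn` call and connection unwrapping (commented out) require a live PostgreSQL server and the PgJDBC driver on the classpath, so only the CSV assembly below runs standalone; the table name `my_table` and its columns are hypothetical.

```java
import java.io.StringReader;

public class CopyDemo {
    public static void main(String[] args) throws Exception {
        // Build CSV-like rows in memory; in practice you would stream
        // this from your data source rather than buffering it all.
        StringBuilder csv = new StringBuilder();
        for (int i = 1; i <= 3; i++) {
            csv.append(i).append(",name-").append(i).append('\n');
        }
        StringReader reader = new StringReader(csv.toString());

        // With a real connection (conn), the COPY call looks like:
        // org.postgresql.copy.CopyManager copier =
        //         conn.unwrap(org.postgresql.PGConnection.class).getCopyAPI();
        // long rowsLoaded = copier.copyIn(
        //         "COPY my_table (id, name) FROM STDIN WITH (FORMAT csv)",
        //         reader);

        System.out.println("rows prepared: 3");
    }
}
```

The whole payload goes over one COPY operation, which is why this beats even a well-tuned batch for bulk loads.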

answered Oct 01 '22 by Craig Ringer


AFAIK there is no limit besides available memory. Regarding your question: the statements are sent to the DB only on executeBatch(), so until you execute the batch, memory will keep growing until you either hit a Java heap space error or the batch is finally sent to the DB.

answered Oct 01 '22 by asafm