Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are batch inserts/updates faster? How do batch updates work?

Tags:

sql

rdbms

Why are batch inserts faster? Is it because the connection and setup overhead for inserting a single row is the same for a set of rows? What other factors make batch inserts faster?

How do batch updates work? Assuming the table has no uniqueness constraints, insert statements don't really have any effect on other insert statements in the batch. However, during batch updates, an update can alter the state of the table and hence can affect the outcome of other update queries in the batch.

I know that batch insert queries have a syntax where you have all the insert values in one big query. How do batch update queries look like? For e.g. if i have single update queries of the form:

update <table> set <column>=<expression> where <condition1> update <table> set <column>=<expression> where <condition2> update <table> set <column>=<expression> where <condition3> update <table> set <column>=<expression> where <condition4> 

What happens when they are used in a batch. What will the single query look like ?

And are batch inserts & updates part of the SQL standard?

like image 317
letronje Avatar asked Jun 17 '09 13:06

letronje


People also ask

How does batch insert work?

In a batch updates, the database works against a set of data, in a row by row update it has to run the same command as may times as there are rows. So if you insert a million rows in a batch, the command is sent and processed once and in a row-by row update, it is sent and processed a million times.

Is insert or update faster?

Insert would be faster because in case of update you need to first search for the record that you are going to update and then perform the update.

What does Batch update mean?

A batch update is a set of multiple update statements that is submitted to the database for processing as a batch. Sending multiple update statements to the database together as a unit can, in some situations, be much more efficient than sending each update statement separately.

Why do we use batch updates?

Batch update allows sending a bulk of statements to the database server to execute all at once, which greatly improves the performance. The program will run much faster if batch update is used.


2 Answers

I was looking for an answer on the same subject, about "bulk/batch" update. People often describe the problem by comparing it with insert clause with multiple value sets (the "bulk" part).

INSERT INTO mytable (mykey, mytext, myint) VALUES    (1, 'text1', 11),   (2, 'text2', 22),   ... 

Clear answer was still avoiding me, but I found the solution here: http://www.postgresql.org/docs/9.1/static/sql-values.html

To make it clear:

UPDATE mytable SET    mytext = myvalues.mytext,   myint = myvalues.myint FROM (   VALUES     (1, 'textA', 99),     (2, 'textB', 88),     ... ) AS myvalues (mykey, mytext, myint) WHERE mytable.mykey = myvalues.mykey 

It has the same property of being "bulk" aka containing alot of data with one statement.

like image 107
Qerr Avatar answered Oct 06 '22 00:10

Qerr


Why are batch inserts faster?

For numerous reasons, but the major three are these:

  • The query doesn't need to be reparsed.
  • The values are transmitted in one round-trip to the server
  • The commands are inside a single transaction

Is it because the connection and setup overhead for inserting a single row is the same for a set of rows?

Partially yes, see above.

How do batch updates work?

This depends on RDBMS.

In Oracle you can transmit all values as a collection and use this collection as a table in a JOIN.

In PostgreSQL and MySQL, you can use the following syntax:

INSERT INTO    mytable VALUES          (value1),         (value2),         … 

You can also prepare a query once and call it in some kind of a loop. Usually there are methods to do this in a client library.

Assuming the table has no uniqueness constraints, insert statements don't really have any effect on other insert statements in the batch. But, during batch updates, an update can alter the state of the table and hence can affect the outcome of other update queries in the batch.

Yes, and you may or may not benefit from this behavior.

I know that batch insert queries have a syntax where you have all the insert values in one big query. How do batch update queries look like?

In Oracle, you use collection in a join:

MERGE INTO    mytable USING   TABLE(:mycol) ON      … WHEN MATCHED THEN UPDATE SET     … 

In PostgreSQL:

UPDATE  mytable SET     s.s_start = 1 FROM    (         VALUES         (value1),         (value2),         …         ) q WHERE   … 
like image 32
Quassnoi Avatar answered Oct 05 '22 23:10

Quassnoi