 

MySQL batch insert on multiple tables with LAST_INSERT_ID()

Tags: database, mysql

I am trying to insert a lot of users into a MySQL database with two tables:

The first table contains the user data. An example INSERT looks like this (id is the primary key, mail is a unique key):

INSERT INTO users (id, mail, name)  
VALUES (NULL, "john@example.com", "John Smith") 
ON DUPLICATE KEY UPDATE name = VALUES(name)

The second table contains the group the user belongs to. It only stores two foreign keys users_id and groups_id. An example query looks like this:

INSERT INTO users_groups (users_id, groups_id)
VALUES (LAST_INSERT_ID(), 1)

This setup works perfectly fine for small data sets. When I import large amounts of data (>1M rows), however, the INSERTs get slow. Obviously, it would be much better to do a batch insert:

INSERT INTO users (id, mail, name)  
VALUES (NULL, "john@example.com", "John Smith"), (NULL, "anna@example.com", "Anna Smith") 
ON DUPLICATE KEY UPDATE name = VALUES(name)

and:

INSERT INTO users_groups (users_id, groups_id)
VALUES (LAST_INSERT_ID(), 1), (LAST_INSERT_ID(), 4)

The problem, of course, is that LAST_INSERT_ID() only returns one (the first) id of a batch INSERT.
So what I would need is a "nested" batch INSERT, which as far as I can tell does not exist in MySQL.
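To illustrate, assuming the two-row batch above just ran and generated the (hypothetical) ids 41 and 42:

SELECT LAST_INSERT_ID();  -- returns 41, the id of the first row ("John Smith");
                          -- the id of "Anna Smith" (42) is not returned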

What can I do to make my INSERTs faster?

Asked Dec 01 '14 by Horen



2 Answers

By default, bulk inserts are assigned consecutive auto-increment ids. With this knowledge you can do your inserts like this (caveat: if some rows hit the ON DUPLICATE KEY UPDATE branch instead of being inserted, the position-to-id mapping below no longer holds):

INSERT INTO users (id, mail, name)  
VALUES  (NULL, "john@example.com", "John Smith"), 
        (NULL, "anna@example.com", "Anna Smith"),
        (...)  # repeat n-times
;

SET @FIRSTID = LAST_INSERT_ID()  # returns the id generated for the FIRST row of the batch
;

INSERT INTO users_groups (users_id, groups_id)
VALUES    (@FIRSTID        , 1),  # ids ascend from the first generated id
          (@FIRSTID + 1    , 1),
          ...
          (@FIRSTID + n - 2, 1), 
          (@FIRSTID + n - 1, 4)
;
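For instance, with just the two example users (a minimal sketch; it assumes both rows were actually inserted, i.e. no duplicate-key skips):

INSERT INTO users (id, mail, name)
VALUES (NULL, "john@example.com", "John Smith"),
       (NULL, "anna@example.com", "Anna Smith");

SET @FIRSTID = LAST_INSERT_ID();

INSERT INTO users_groups (users_id, groups_id)
VALUES (@FIRSTID    , 1),  # John Smith -> group 1
       (@FIRSTID + 1, 4);  # Anna Smith -> group 4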

For more information on bulk inserts and auto-increment, have a look at http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html

Importantly, make sure that innodb_autoinc_lock_mode=1 ("consecutive" mode), which guarantees that a single multi-row INSERT receives consecutive auto-increment values:

SHOW GLOBAL VARIABLES LIKE 'innodb_autoinc_lock_mode';
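If it is not, note that innodb_autoinc_lock_mode can only be changed at server startup, e.g. in my.cnf (a sketch; 1 is already the default in MySQL 5.x):

[mysqld]
innodb_autoinc_lock_mode = 1   # "consecutive" mode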

Otherwise, consider wrapping your inserts in LOCK TABLES:

LOCK TABLES tbl_name WRITE
... sqls ...
UNLOCK TABLES
Answered Sep 18 '22 by harvey


If you're putting millions of known rows into a table all at once, consider using LOAD DATA INFILE since it's intended for speed in just that type of scenario, as evidenced by this quote from the docs:

The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.

And at Speed of INSERT Statements:

When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements.

This is assuming that your source data is coming from, or could be provided as, a text file. If you have the group ID in the file as well, you might do something like this:

CREATE TEMPORARY TABLE load_users_groups (
  mail VARCHAR(60),
  name VARCHAR(60),
  groupid INT,
  PRIMARY KEY (mail, name)
);

LOAD DATA INFILE '/path/to/file.csv'
INTO TABLE load_users_groups
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';  -- use whatever optional syntax is required to parse your file

INSERT INTO users (mail, name)
SELECT mail, name FROM load_users_groups
ON DUPLICATE KEY UPDATE name = VALUES(name);

INSERT INTO users_groups (users_id, groups_id)
SELECT users.id, load_users_groups.groupid
FROM users JOIN load_users_groups USING (mail, name);

DROP TEMPORARY TABLE load_users_groups;
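For reference, a sketch of what /path/to/file.csv could look like under these assumptions (fields in the temporary table's column order: mail, name, groupid):

john@example.com,John Smith,1
anna@example.com,Anna Smith,4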

Whether this approach ends up being faster than your current approach depends on whether you save more time using LOAD DATA INFILE than you spend performing two additional INSERT ... SELECT statements to move the data into your desired tables. You may want to tweak keys on the temporary table; I can't benchmark it for you based solely on the contents of your question. I'd be interested to know how it works out, though.

The documentation also has a decent number of tips for Bulk Data Loading for InnoDB Tables and Bulk Data Loading for MyISAM Tables. I won't go through them in detail, not least because you haven't given us any DDL or server info, but you may find it helpful to read through one or the other on your own time.
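As one example, a few of the InnoDB tips from that page applied around the statements above (a sketch; only disable the checks if you trust the source data):

SET autocommit = 0;          -- group the whole load into one transaction
SET unique_checks = 0;       -- only if the source data is known to be unique
SET foreign_key_checks = 0;  -- only if referential integrity is already guaranteed

-- ... the LOAD DATA INFILE and INSERT ... SELECT statements from above ...

SET foreign_key_checks = 1;
SET unique_checks = 1;
COMMIT;
SET autocommit = 1;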

Answered Sep 20 '22 by Air