
Why is MySQL 'insert into ... select ...' so much slower than a select alone?

I'm trying to store a query result in a temporary table for further processing.

create temporary table tmpTest
(
    a FLOAT,
    b FLOAT,
    c FLOAT
)
engine = memory;

insert into tmpTest
(
    select a,b,c from someTable
    where ...
);

But for some reason the insert takes up to a minute, whereas the subselect alone takes only a few seconds. Why would it take so much longer to write the data to a temporary table than to print it to my SQL management tool's output?

UPDATE - My setup: MySQL Cluster 7.3.2 with 8 Debian Linux NDB data nodes and 1 SQL node (Windows Server 2012).

The table I'm running the select on is an NDB table.

I tried to find out whether the execution plan differs when using 'insert into ...', but the plans look the same: (sorry for the formatting, stackoverflow doesn't have tables)

id  select_type     table       type    possible_keys   key     key_len ref                 rows        Extra
1   PRIMARY         <subquery3> ALL     \N              \N      \N      \N                  \N          \N
1   PRIMARY         foo         ref     PRIMARY         PRIMARY 3       <subquery3>.fooId   9747434     Using where
2   SUBQUERY        someTable   range   PRIMARY         PRIMARY 3       \N                  136933000   Using where with pushed condition; Using MRR; Using temporary; Using filesort
3   MATERIALIZED    tmpBar      ALL     \N              \N      \N      \N                  1000        \N

CREATE TABLE ... SELECT is slow, too: 47 seconds vs. 5 seconds without the table insert/create.

Ben asked Oct 09 '13

2 Answers

I wrote a comment above, then stumbled across this as a workaround.

This will accomplish what you want to do.

SELECT * FROM aTable INTO OUTFILE '/tmp/atable.txt';
LOAD DATA INFILE '/tmp/atable.txt' INTO TABLE anotherTable;

Note that doing this means managing the files in /tmp in some way. If you try to SELECT data into an OUTFILE that already exists, you get an error. So you need to generate unique temporary file names, and then run a cron job of some sort to go clean them up.
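One way to handle the unique-name problem is to build the file name from a timestamp and process ID before issuing the two statements. A minimal sketch in Python; the wrapper and the exact naming scheme are my own illustration, not from the answer:

```python
# Hypothetical helper: generate a unique per-run OUTFILE name so that
# SELECT ... INTO OUTFILE never hits an already-existing file.
import os
import time

# Timestamp + PID keeps concurrent runs from colliding.
outfile = f"/tmp/atable_{int(time.time())}_{os.getpid()}.txt"

export_sql = f"SELECT * FROM aTable INTO OUTFILE '{outfile}';"
import_sql = f"LOAD DATA INFILE '{outfile}' INTO TABLE anotherTable;"

print(export_sql)
print(import_sql)
# In practice you would send these to the server via a client library,
# then remove the file afterwards - or let a cron job sweep /tmp, as
# the answer suggests.
```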

I guess INFILE and OUTFILE behave differently. If someone can shed some light on what is going on here to explain MySQL's behavior, I would appreciate it.

D

Here is a better way than using INFILE / OUTFILE.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO aTable SELECT ... FROM ...;

Here is a relevant post to read:

How to improve INSERT INTO ... SELECT locking behavior

Don Wool answered Sep 30 '22


I experienced the same issue and was playing around with subqueries, which actually solved it. If the SELECT matches a huge number of rows, the insert takes very long. Example:

INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1

Using LIMIT in combination with INSERT ... SELECT is not a good option. So you could use two separate queries for getting and inserting the data, or you can use a subquery. Example:

INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT * FROM (
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1
) sub1

That would be a fast solution without changing your script.

I'm not sure why it takes 0.01 seconds to run the subquery but 60 seconds to run the direct insert (I get 1000+ results without the LIMIT). In my case, wrapping the SELECT in a subquery improved performance from 60 seconds to 0.01 seconds.

joevette answered Sep 30 '22