 

bulk insert with or without index

In a comment I read:

"Just as a side note, it's sometimes faster to drop the indices of your table and recreate them after the bulk insert operation."

Is this true? Under which circumstances?

khooler asked Dec 08 '08 16:12




2 Answers

Like Joel, I will echo that yes, it can be true. I've found that the key to identifying the scenario he mentioned lies in the distribution of the data and the size of the index(es) on the specific table.

An application I used to support did a regular bulk import of 1.8 million rows into a table with 90 columns and 4 indexes, one of which covered 11 columns. The import with the indexes in place took over 20 hours to complete. Dropping the indexes, inserting, and re-creating the indexes took only 1 hour and 25 minutes.

So it can be a big help, but a lot of it comes down to your data, the indexes, and the distribution of data values.

Mitchel Sellers answered Sep 18 '22 20:09



Yes, it is true. When there are indexes on the table during an insert, the server must constantly re-order and re-page the table to keep the indexes up to date. If you drop the indexes, it can just add the rows without worrying about that, and then build the indexes all at once when you re-create them.
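The drop, load, and re-create sequence can be sketched roughly as follows. This is only an illustration, assuming Python's built-in sqlite3 module and an in-memory database; the table `items`, the index `idx_items_name`, and the helper `bulk_load_without_indexes` are hypothetical names, and the exact statements will differ on other database servers.

```python
import sqlite3


def bulk_load_without_indexes(conn, rows):
    """Drop the secondary index, insert all rows, then rebuild the index once."""
    cur = conn.cursor()
    # Hypothetical index name; without it, inserts do no index maintenance.
    cur.execute("DROP INDEX IF EXISTS idx_items_name")
    cur.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)
    # Build the index in a single pass over the fully loaded table.
    cur.execute("CREATE INDEX idx_items_name ON items (name)")
    conn.commit()


# Assumed setup: an in-memory SQLite table that already has one secondary index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_items_name ON items (name)")

rows = [(i, f"name-{i}") for i in range(1000)]
bulk_load_without_indexes(conn, rows)

row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('items')")]
```

While the index is missing, queries that depend on it will be slow and any uniqueness it enforced is not checked, so this pattern suits a load window in which nothing else uses the table.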


The exception, of course, is when the import data is already in index order. In fact, I should note that I'm working on a project right now where the opposite effect was observed. We wanted to reduce the run-time of a large import (nightly dump from a mainframe system). We tried removing the indexes, importing the data, and re-creating them. It actually significantly increased the time for the import to complete. But this is not typical. It just goes to show that you should always test first for your particular system.
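One way to run such a test is to time both strategies against the same data, once in random key order and once pre-sorted. The harness below is a minimal sketch under stated assumptions: it uses Python's sqlite3 with an in-memory database, a hypothetical table `t` with index `idx_t_k`, and synthetic rows. Its timings say nothing about a production server, so the same comparison would need to be repeated on the real system with real data.

```python
import random
import sqlite3
import time


def timed_load(rows, keep_index):
    """Load rows into a fresh in-memory table; return (seconds, row_count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (k INTEGER, v TEXT)")
    if keep_index:
        # Strategy A: the index exists during the load and is maintained per row.
        conn.execute("CREATE INDEX idx_t_k ON t (k)")
    start = time.perf_counter()
    conn.executemany("INSERT INTO t (k, v) VALUES (?, ?)", rows)
    if not keep_index:
        # Strategy B: the index is built once, after all rows are in place.
        conn.execute("CREATE INDEX idx_t_k ON t (k)")
    conn.commit()
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


random.seed(0)
# Synthetic data (an assumption): random keys, plus the same rows in key order.
shuffled = [(random.randrange(10**9), "x") for _ in range(50_000)]
presorted = sorted(shuffled)

results = {}
for label, rows in (("shuffled", shuffled), ("presorted", presorted)):
    for keep_index in (True, False):
        results[(label, keep_index)] = timed_load(rows, keep_index)

for (label, keep_index), (seconds, count) in results.items():
    strategy = "index kept" if keep_index else "index rebuilt after load"
    print(f"{label:9s} {strategy:25s} {seconds:.3f}s ({count} rows)")
```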

Joel Coehoorn answered Sep 16 '22 20:09
