Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it better to create an index before filling a table with data, or after the data is in place?

I have a table of about 100M rows that I am going to copy to alter, adding an index. I'm not so concerned with the time it takes to create the new table, but will the created index be more efficient if I alter the table before inserting any data or insert the data first and then add the index?

like image 622
Drew Stephens Avatar asked Sep 10 '10 22:09

Drew Stephens


People also ask

How we can create indexes after completing the load process?

Create a command task which will execute a shell script (if Unix) or any other scripts which contains the create index command. Use this command task in the workflow after the session or else, You can create it with a post session command.


1 Answers

Creating index after data insert is more efficient way (it even often recomended to drop index before batch import and after import recreate it).

Syntetic example (PostgreSQL 9.1, slow development machine, one million rows):

CREATE TABLE test1(id serial, x integer); INSERT INTO test1(id, x) SELECT x.id, x.id*100 FROM generate_series(1,1000000) AS x(id); -- Time: 7816.561 ms CREATE INDEX test1_x ON test1 (x); -- Time: 4183.614 ms 

Insert and then create index - about 12 sec

CREATE TABLE test2(id serial, x integer); CREATE INDEX test2_x ON test2 (x); -- Time: 2.315 ms INSERT INTO test2(id, x) SELECT x.id, x.id*100 FROM generate_series(1,1000000) AS x(id); -- Time: 25399.460 ms 

Create index and then insert - about 25.5 sec (more than two times slower)

like image 85
valodzka Avatar answered Sep 19 '22 19:09

valodzka