Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server / MySQL / Access - speeding up inserting many rows in an inefficient manner

SETUP

I have to insert a couple million rows in either SQL Server 2000/2005, MySQL, or Access. Unfortunately I don't have an easy way to use bulk insert or BCP or any of the other ways that a normal human would go about this. The inserts will happen on one particular database but that code needs to be db agnostic -- so I can't do bulk copy, or SELECT INTO, or BCP. I can however run specific queries before and after the inserts, depending on which database I'm importing to.

eg.

If IsSqlServer() Then
    DisableTransactionLogging();
ElseIf IsMySQL() Then
    DisableMySQLIndices();
End If

... do inserts ...

If IsSqlServer() Then
    EnableTransactionLogging();
ElseIf IsMySQL() Then
    EnableMySQLIndices();
End If

QUESTION

Are there any interesting things I can do to SQL Server that might speed up these inserts?

For example, is there a command I could issue to tell SQL Server, "Hey, don't bother recording these transactions in the transaction log".

Or maybe I could say, "Hey, I have a million rows coming in, so don't update your index until I'm totally finished".

ALTER INDEX [IX_TableIndex] ON Table DISABLE
     ... inserts
ALTER INDEX [IX_TableIndex] ON Table REBUILD

(Note: Above index disable only works on 2005, not 2000. Bonus points if you know a way to do this on 2000).

What about MySQL, and Access?

like image 740
Michael Pryor Avatar asked Jan 07 '09 15:01

Michael Pryor


2 Answers

The single biggest thing that will kill performance here is the fact that (it sounds like) you're executing a million different INSERTs against the DB. Each INSERT is treated as a single operation. If you can do this as a single operation, then you will almost certainly have a huge performance improvement.

Both MySQL and SQL Server support 'selects' of constant expressions without a table name, so this should work as one statement:

INSERT INTO MyTable(ID, name)
SELECT 1, 'Fred'
UNION ALL SELECT 2, 'Wilma'
UNION ALL SELECT 3, 'Barney'
UNION ALL SELECT 4, 'Betty'

It's not clear to me if Access supports that, not having Access available. HOWEVER, Access does support constants in a SELECT, as far as I can tell, and you can coerce the above into ANSI SQL-92 (which should be supported by all 3 engines; it's about as close to 'DB agnostic' as you'll get) by just adding

FROM OneRowTable

to the end of every individual SELECT, where 'OneRowTable' is a table with just one row of dummy data.

This should let you insert a million rows of data in much much less than a million INSERT statements -- and things like index reshuffling will be done once, rather than a million times. You may have much less need for other optimisations after that.

like image 153
Cowan Avatar answered Sep 27 '22 22:09

Cowan


is this a regular process or a one time event?

I have, in the past, just scripted out the current indexes, dropped them, inserted the rows, then just re-add the indexes.

The SQL Management Studio can script out the indexes from the right click menus...

like image 41
Jeff Martin Avatar answered Sep 27 '22 21:09

Jeff Martin