
Slow MySQL inserts

I am using and working on software that uses MySQL as its backend engine (it can use others, such as PostgreSQL, Oracle, or SQLite, but MySQL is the main one we use). The software was designed in such a way that the binary data we want to access is kept as BLOBs in individual columns (each table has one BLOB column, other columns with integers/floats that characterize the BLOB, and one string column with the BLOB's MD5 hash). The tables typically have 2, 3 or 4 indexes, one of which is always on the MD5 column, which is made UNIQUE. Some tables already have millions of entries and have grown to multiple gigabytes in size. We keep separate per-year MySQL databases on the same server (so far). The hardware is quite reasonable (I think) for general applications (a Dell PowerEdge 2U-form server).

MySQL SELECT queries are relatively fast. There's little complaint there, since they run (most of the time) in batch mode. However, INSERT queries take a long time, and the time increases with table size (number of rows). Admittedly, this is because the MD5 column has a UNIQUE index, so each INSERT has to check whether the new row's MD5 string has already been inserted. And it's not too strange (I think) that performance gets worse when there are other (non-unique) indexes. But I still can't shake the feeling that this software architecture is not the best choice (I suspect that keeping BLOBs in the table row instead of on disk has a significant negative impact). Insertions are not critical, but it is an annoying feeling to have.

Does anyone have experience with similar situations? With MySQL, or even other (preferably Linux-based) RDBMSes? Any insights you would care to provide, maybe some performance figures?

BTW, the working language is C++ (which wraps C calls to MySQL's API).

asked Sep 01 '09 09:09 by jbatista



3 Answers

It could be time for horizontal partitioning, and for moving the BLOB field into a separate table (vertical partitioning). In this article, under 'A Quick Side Note on Vertical Partitioning', the author removes a larger varchar field from a table, and that speeds up a query by about an order of magnitude.

The reason is that physically traversing the data on disk becomes significantly faster when there is less of it to cover, so moving the bigger fields elsewhere improves performance.

Also (and you probably do it already) it is beneficial to shrink your index column to its absolute minimum (char(32) in ascii encoding for md5), because the smaller the key, the faster the index is to use.
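
Going one step beyond what this answer suggests (this is my assumption, not something the answer states), the MD5 could be stored as 16 raw bytes in a BINARY(16) column instead of 32 hex characters, which halves the key size. A minimal C++ sketch of the client-side conversion follows; the names hex_value and md5_hex_to_bytes are hypothetical, and whether the smaller key helps measurably in this application is untested.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Value of a single hex digit, or -1 if the character is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: turns a 32-character hex MD5 digest into the
// 16 raw bytes that a BINARY(16) column would hold.
std::array<std::uint8_t, 16> md5_hex_to_bytes(const std::string& hex) {
    if (hex.size() != 32) {
        throw std::invalid_argument("MD5 hex digest must be 32 characters");
    }
    std::array<std::uint8_t, 16> out{};
    assert(hex.size() == 2 * out.size());  // two hex digits per output byte
    for (std::size_t i = 0; i < out.size(); ++i) {
        const int hi = hex_value(hex[2 * i]);
        const int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            throw std::invalid_argument("non-hex character in MD5 digest");
        }
        out[i] = static_cast<std::uint8_t>((hi << 4) | lo);
    }
    return out;
}
```

The resulting bytes would be bound as a binary parameter on insert and lookup. If the application already has the raw digest from its MD5 routine, no conversion is needed at all.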

If you do multiple inserts at a time with InnoDB tables, you can significantly increase the speed of inserts by wrapping them in a transaction and inserting multiple rows in one query:

START TRANSACTION;
INSERT INTO x (id, md5, field1, field2) VALUES (1, '123dab...', 'data1', 'data2'), (2, 'ab2...', 'data3', 'data4'), .....;
COMMIT;
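
Since the application is C++ on top of the MySQL C API, the multi-row statement above could be generated rather than written by hand. The sketch below only builds the statement text with ? placeholders, so that BLOB and string values can be bound instead of escaped into the SQL. The function name build_multi_row_insert is hypothetical, and table and column names are assumed to be trusted identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds
//   "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)"
// for row_count rows. Identifiers are inserted as-is, so they must come
// from trusted code, never from user input.
std::string build_multi_row_insert(const std::string& table,
                                   const std::vector<std::string>& columns,
                                   std::size_t row_count) {
    assert(!columns.empty() && row_count > 0);

    std::string column_list;  // e.g. "id, md5, field1, field2"
    std::string group = "(";  // one placeholder group, e.g. "(?, ?, ?, ?)"
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (i > 0) {
            column_list += ", ";
            group += ", ";
        }
        column_list += columns[i];
        group += "?";
    }
    group += ")";

    std::string sql = "INSERT INTO " + table + " (" + column_list + ") VALUES ";
    for (std::size_t r = 0; r < row_count; ++r) {
        if (r > 0) sql += ", ";
        sql += group;  // one group per row in the batch
    }
    return sql;
}
```

The returned text would then go to mysql_stmt_prepare, with the values attached through mysql_stmt_bind_param, inside the START TRANSACTION / COMMIT pair. With BLOB columns the batch size probably has to stay small, because the whole statement must fit within the server's max_allowed_packet.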

answered Oct 09 '22 07:10 by dimus


See Speed of INSERT Statements in the MySQL manual. Do you have frequent MD5 collisions? I believe these should be rare, so maybe you can use something like INSERT ... ON DUPLICATE KEY UPDATE to handle them. If inserts happen in specific periods, you can disable keys for the duration of the insert (ALTER TABLE ... DISABLE KEYS, which as far as I know only affects non-unique indexes on MyISAM tables) and re-enable them afterwards. Another option is replication, with a master machine for the inserts and a slave for the selects.
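
As a rough illustration of how these two suggestions could fit together in the C++ client, the sketch below lays out the statements for one insert window in order. The name bulk_load_plan is hypothetical, the "col = col" clause is a common no-op way to let a duplicate row pass without an error, and the DISABLE KEYS step is assumed to matter only for MyISAM tables.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the statements for one bulk-insert window,
// in the order they would be executed.
// Assumptions: table and unique_column are trusted identifiers, and each
// entry in inserts is a complete INSERT statement without a trailing ';'.
std::vector<std::string> bulk_load_plan(const std::string& table,
                                        const std::string& unique_column,
                                        const std::vector<std::string>& inserts) {
    assert(!table.empty() && !unique_column.empty());

    std::vector<std::string> plan;
    // Suspend maintenance of the non-unique indexes during the load.
    plan.push_back("ALTER TABLE " + table + " DISABLE KEYS");
    for (const std::string& insert : inserts) {
        // On a duplicate MD5, keep the existing row instead of failing.
        plan.push_back(insert + " ON DUPLICATE KEY UPDATE " +
                       unique_column + " = " + unique_column);
    }
    // Rebuild the suspended indexes in one pass.
    plan.push_back("ALTER TABLE " + table + " ENABLE KEYS");
    return plan;
}
```

Each string would be sent to the server in turn. The UNIQUE index on the MD5 column stays active throughout, so the duplicate check itself still costs an index lookup per row.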

answered Oct 09 '22 07:10 by Yuval F


Are you using MyISAM?
AFAIK MyISAM has very good read performance but poor write performance.

InnoDB should be more balanced in speed.

answered Oct 09 '22 06:10 by ppuschmann