 

Do large transactions have linearly- or exponentially-increasing performance/resource cost?

I have a very general question:

1.) Given a transaction (in MySQL) that modifies tens or hundreds of thousands, or even millions, of records: is such a transaction linearly more expensive than a very "small" transaction? Or does a large transaction consume exponentially more resources than a small one (in other words, should it be avoided)?

2.) Is it true that, when committing such a large transaction, the load on MySQL jumps to a very high value until the transaction has been processed completely?

The question relates mainly to MySQL, but if there are database servers that handle this much better, it would be interesting to hear about them. And yes, the server/system carries quite a bit of load from other processes (so this is not the only transaction being processed on the system, i.e. medium/high contention), and the isolation level is one of the higher/highest levels (the ones that bring all the benefits of real transactions).

Thank you very much

Markus, asked Jun 17 '11 18:06
1 Answer

This is not a complete response; I hope someone will give you some useful hints on the subject.

But I can give you at least one tip.

With the serializable isolation level, the biggest problem is that when your transaction is long, there is a high chance it will be automatically rolled back by the engine. At the serializable level, if any other transaction commits a change that alters one of the rows your transaction touched, your transaction is aborted. You therefore have to think of a serializable transaction as something you may have to redo several times until it succeeds. So if it is big and you are not alone on the server, it may take a very long time to complete, or prove impossible to run at all.
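The "redo several times" idea is usually implemented as a retry loop around the transaction. Here is a minimal, hypothetical sketch: `run_transaction` and `SerializationFailure` are stand-ins for your driver's transaction call and its serialization-failure error (in MySQL, for example, a deadlock rollback is reported as error 1213).

```python
class SerializationFailure(Exception):
    """Stand-in for the engine aborting a transaction (e.g. MySQL error 1213)."""

def with_retries(run_transaction, max_attempts=5):
    """Run `run_transaction` until it commits or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up after the last attempt and surface the error

# Usage: a fake transaction that is rolled back twice, then succeeds.
attempts = {"n": 0}

def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"
```

The longer the transaction, the wider its window of exposure to a conflicting commit, so each retry of a big transaction is itself likely to fail again; that is exactly the pathology described above.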

If we are talking about altering all the rows of a million-row table in a serializable transaction, you will almost certainly need an application-level lock or a semaphore (something above the database transaction) to tell the other processes that you are running a heavy task and that they should wait a little and let you do that important work :-)
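A sketch of that application-level lock, assuming workers are threads in one process (in MySQL itself you could get the same effect across processes with `SELECT GET_LOCK('maintenance', timeout)` and `RELEASE_LOCK('maintenance')`); `heavy_maintenance` and `normal_work` are hypothetical names:

```python
import threading

# One lock shared by all workers; whoever holds it has exclusive access.
maintenance_lock = threading.Lock()

def heavy_maintenance(rows):
    with maintenance_lock:             # other workers block here until done
        return [r * 2 for r in rows]   # stand-in for the big update

def normal_work(row):
    with maintenance_lock:             # waits while maintenance is running
        return row + 1
```

The point is that the coordination happens above the database: competing writers wait voluntarily instead of colliding with the big transaction and forcing it to roll back.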

But if you can treat this process as a maintenance task on each row, and it is acceptable in your environment that some rows are in the new state while others are not yet, then use a transaction per row rather than one big transaction. You should use big serializable transactions only when it is really important that all the impacted rows switch state at the same time (that is atomicity). Chances are that this is not the case, is it?
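The per-row (or per-batch) approach can be sketched as a loop that commits every N rows instead of once at the end. This is a hypothetical helper; `update_row` and `commit` are stand-ins for your driver's statement execution and commit calls:

```python
def batched_update(rows, update_row, commit, batch_size=1000):
    """Apply `update_row` to every row, committing every `batch_size` rows."""
    pending = 0
    for row in rows:
        update_row(row)
        pending += 1
        if pending >= batch_size:
            commit()   # keep each transaction short and cheap to retry
            pending = 0
    if pending:
        commit()       # flush the final partial batch
```

Each small transaction holds its locks only briefly, so concurrent work proceeds between commits; the trade-off, as noted above, is that readers can observe the table half-updated.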

regilero, answered Sep 21 '22 06:09