 

MySQL Long Query Progress Monitoring

Tags:

sql

mysql

Just to preface my question, I understand that there is no direct support for something like this. What I am looking for is any sort of work-around, or convoluted derivation that would get me a half-respectable result.

I am working with a rather large MySQL cluster (tables > 400 million rows) using the cluster engine.

Is anyone aware of a way to either directly retrieve or otherwise derive a somewhat (or better) accurate indication of progress through a long-running query in MySQL? I have some queries that can take up to 45 minutes, and I need to determine whether we're 10% or 90% of the way through the processing.

EDIT:

As requested in the comments, here is a distilled and generalized version of one of the queries that prompted my original question...

SELECT `userId`
FROM `openEndedResponses` AS `oe`
WHERE `oe`.`questionId` = 3 -- zip code
  AND REPLACE( REPLACE( `oe`.`value`, ' ', '' ), '-', '' ) IN
      ( '30071', '30106', '30122', '30134', '30135', '30168',
        '30180', '30185', '30187', '30317', '30004' );

This query is run against a single table with ~95 million rows. It takes 8 seconds to run the query and another 13 to transfer the data (21 seconds total). Considering the size of the table, and the fact that string manipulation functions are being used, I'd say it's running pretty damn fast. However, to the user, it's still 21 seconds of the application appearing either stuck or idle. Some indication of progress would be ideal.

asked Mar 28 '11 by KOGI



2 Answers

I know this is an old question, but I was looking for a similar answer while trying to figure out how much longer my UPDATE over 250m rows would take.

If you run:

SHOW ENGINE INNODB STATUS \G 

Then under TRANSACTIONS, find the transaction in question and examine this section:

---TRANSACTION 34282360, ACTIVE 71195 sec starting index read
mysql tables in use 2, locked 2
1985355 lock struct(s), heap size 203333840, 255691088 row lock(s), undo log entries 21355084

The important bit is "undo log entries". In my case, it seemed to add one undo log entry for each updated row (try running it again after a few seconds and see how many have been added).

If you skip to the end of the status report, you'll see this:

Number of rows inserted 606188224, updated 251615579, deleted 1667, read 54873415652
0.00 inserts/s, 1595.44 updates/s, 0.00 deletes/s, 3190.88 reads/s

Here we can see that the rate at which updates are being applied is 1595.44 rows per second (although if you're running other update queries in tandem, this rate may be split between your queries).

So from this, I know ~21m rows have been updated, with (250m - 21m) 229m rows left to go.

229,000,000 / 1600 ≈ 143,125 seconds to go
(143,125 / 60) / 60 ≈ 39.76 hours to go
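The arithmetic above can be wrapped in a small helper. This is only a sketch: it assumes you have already pulled the "undo log entries" count and the updates/s figure out of SHOW ENGINE INNODB STATUS by hand (the status-report parsing is not shown), and the function name is made up for illustration.

```python
def eta_hours(total_rows, undo_log_entries, updates_per_sec):
    """Estimate remaining time for a long UPDATE from InnoDB status figures.

    total_rows       -- rows the UPDATE is expected to touch
    undo_log_entries -- 'undo log entries' for the transaction (rows done so far)
    updates_per_sec  -- current 'updates/s' figure from the status report
    """
    remaining = total_rows - undo_log_entries
    seconds = remaining / updates_per_sec
    return seconds / 3600.0

# Figures from this answer: 250m rows total, ~21m done, ~1600 updates/s
print(round(eta_hours(250_000_000, 21_000_000, 1600), 2))  # → 39.76
```

Re-running SHOW ENGINE INNODB STATUS every few minutes and recomputing gives a rolling estimate, which helps when the update rate fluctuates.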

So it would appear I can twiddle my thumbs for another couple of days. Unless this answer is wrong, in which case I'll update it sometime before then!

answered Sep 24 '22 by lightsurge


I was able to estimate something like this by querying the number of rows to process, then breaking the processing into a loop, working on only a subset of the total rows at a time.

The full loop was rather involved, but the basic logic went like:

SELECT @minID = Min(keyColumn) FROM table WHERE condition
SELECT @maxID = Max(keyColumn) FROM table WHERE condition
SELECT @potentialRows = (@maxID - @minID) / @iterations

WHILE @minID < @maxID
BEGIN
    SET @breakID = @minID + @potentialRows

    SELECT columns FROM table WITH (NOLOCK, ...)
    WHERE condition AND keyColumn BETWEEN @minID AND @breakID

    SET @minID = @breakID + 1
END

Note this works best if IDs are evenly distributed.
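The same key-range chunking can be sketched as a runnable example. This uses Python with an in-memory sqlite3 table purely so it is self-contained; the table name, column names, and row counts are made up for illustration, and against a real MySQL table you would run the MIN/MAX and BETWEEN queries over your actual key column instead.

```python
import sqlite3

# Toy table standing in for the real one; keys are evenly distributed,
# which is the case where this progress estimate works best.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (keyColumn INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, "x") for i in range(1, 10_001)])

iterations = 10
min_id, = conn.execute("SELECT MIN(keyColumn) FROM t").fetchone()
max_id, = conn.execute("SELECT MAX(keyColumn) FROM t").fetchone()
total = max_id - min_id + 1
chunk = (max_id - min_id) // iterations or 1

done = 0
while min_id <= max_id:
    break_id = min_id + chunk
    # Process one key-range slice; a real query would also carry the
    # WHERE condition from the original loop.
    rows = conn.execute(
        "SELECT keyColumn, val FROM t WHERE keyColumn BETWEEN ? AND ?",
        (min_id, break_id),
    ).fetchall()
    done += len(rows)
    print(f"processed {done} rows (~{100 * done // total}% done)")
    min_id = break_id + 1
```

Each pass through the loop reports cumulative progress, so the user sees the percentage tick up instead of a single 21-second stall.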

answered Sep 22 '22 by Dour High Arch