 

MySQL Long Query Progress Monitoring

Tags:

sql

mysql

Just to preface my question, I understand that there is no direct support for something like this. What I am looking for is any sort of work-around, or convoluted derivation that would get me a half-respectable result.

I am working with a rather large MySQL cluster (tables > 400 million rows) using the cluster engine.

Is anyone aware of a way to either directly retrieve or otherwise derive a somewhat (or better) accurate indication of progress through a long-running query in MySQL? I have some queries that can take up to 45 minutes, and I need to determine whether we're 10% or 90% of the way through the processing.

EDIT:

As requested in the comments, here is a distilled and generalized version of one of the queries that prompted my original question...

SELECT `userId`
FROM `openEndedResponses` AS `oe`
WHERE `oe`.`questionId` = 3 -- zip code
  AND REPLACE( REPLACE( `oe`.`value`, ' ', '' ), '-', '' ) IN
      ( '30071', '30106', '30122', '30134', '30135', '30168',
        '30180', '30185', '30187', '30317', '30004' );

This query is run against a single table with ~95 million rows. It takes 8 seconds to run the query and another 13 to transfer the data (21 seconds total). Considering the size of the table, and the fact that string manipulation functions are being used, I'd say it's running pretty damn fast. However, to the user, it's still 21 seconds of the application appearing either stuck or idle. Some indication of progress would be ideal.

asked Mar 28 '11 by KOGI



2 Answers

I know this is an old question, but I was looking for a similar answer while trying to figure out how much longer my UPDATE over 250m rows would take.

If you run:

SHOW ENGINE INNODB STATUS \G 

Then under TRANSACTIONS, find the transaction in question and examine this section:

---TRANSACTION 34282360, ACTIVE 71195 sec starting index read
mysql tables in use 2, locked 2
1985355 lock struct(s), heap size 203333840, 255691088 row lock(s), undo log entries 21355084

The important bit is "undo log entries". In my case, it seemed to add one undo log entry for each updated row (try running it again after a few seconds and see how many have been added).

If you skip to the end of the status report, you'll see this:

Number of rows inserted 606188224, updated 251615579, deleted 1667, read 54873415652
0.00 inserts/s, 1595.44 updates/s, 0.00 deletes/s, 3190.88 reads/s

Here we can see that the rate at which updates are being applied is 1595.44 rows per second (although if you're running other update queries in tandem, this rate may be split between your queries).

So from this, I know ~21m rows have been updated, with (250m - 21m) 229m rows left to go.

229,000,000 / 1600 ≈ 143,125 seconds to go
(143,125 / 60) / 60 ≈ 39.76 hours to go
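The arithmetic above can be wrapped in a small helper. This is only a sketch: it assumes you have already pulled the "undo log entries" count and the updates/s figure out of SHOW ENGINE INNODB STATUS by hand (the status-report parsing is not shown), and the function name is made up for illustration.

```python
def eta_hours(total_rows, undo_log_entries, updates_per_sec):
    """Estimate remaining time for a long UPDATE from InnoDB status figures.

    total_rows       -- rows the UPDATE is expected to touch
    undo_log_entries -- 'undo log entries' for the transaction (rows done so far)
    updates_per_sec  -- current 'updates/s' figure from the status report
    """
    remaining = total_rows - undo_log_entries
    seconds = remaining / updates_per_sec
    return seconds / 3600.0

# Figures from this answer: 250m rows total, ~21m done, ~1600 updates/s
print(round(eta_hours(250_000_000, 21_000_000, 1600), 2))  # → 39.76
```

Re-running SHOW ENGINE INNODB STATUS every few minutes and recomputing gives a rolling estimate, which helps when the update rate fluctuates.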

So it would appear I can twiddle my thumbs for another couple of days. Unless this answer is wrong, in which case I'll update it sometime before then!

answered Sep 24 '22 by lightsurge


I was able to estimate something like this by querying the number of rows to process, then breaking the processing into a loop, working on only a subset of the total rows at a time.

The full loop was rather involved, but the basic logic went like:

SELECT @minID = Min(keyColumn) FROM table WHERE condition
SELECT @maxID = Max(keyColumn) FROM table WHERE condition
SELECT @potentialRows = (@maxID - @minID) / @iterations

WHILE @minID < @maxID
BEGIN
    SET @breakID = @minID + @potentialRows

    SELECT columns FROM table WITH (NOLOCK, ...)
    WHERE condition AND keyColumn BETWEEN @minID AND @breakID

    SET @minID = @breakID + 1
END

Note this works best if IDs are evenly distributed.
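The same key-range chunking can be sketched as a runnable example. This uses Python with an in-memory sqlite3 table purely so it is self-contained; the table name, column names, and row counts are made up for illustration, and against a real MySQL table you would run the MIN/MAX and BETWEEN queries over your actual key column instead.

```python
import sqlite3

# Toy table standing in for the real one; keys are evenly distributed,
# which is the case where this progress estimate works best.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (keyColumn INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, "x") for i in range(1, 10_001)])

iterations = 10
min_id, = conn.execute("SELECT MIN(keyColumn) FROM t").fetchone()
max_id, = conn.execute("SELECT MAX(keyColumn) FROM t").fetchone()
total = max_id - min_id + 1
chunk = (max_id - min_id) // iterations or 1

done = 0
while min_id <= max_id:
    break_id = min_id + chunk
    # Process one key-range slice; a real query would also carry the
    # WHERE condition from the original loop.
    rows = conn.execute(
        "SELECT keyColumn, val FROM t WHERE keyColumn BETWEEN ? AND ?",
        (min_id, break_id),
    ).fetchall()
    done += len(rows)
    print(f"processed {done} rows (~{100 * done // total}% done)")
    min_id = break_id + 1
```

Each pass through the loop reports cumulative progress, so the user sees the percentage tick up instead of a single 21-second stall.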

answered Sep 22 '22 by Dour High Arch