I'll start off with something from the MySQL Online DDL Limitations page:
There is no mechanism to pause an online DDL operation or to throttle I/O or CPU usage for an online DDL operation.
However, I'm still interested in solutions that I might have missed.
The situation: the indexes are getting larger and larger, to the point where there soon won't be enough memory to serve the queries in use, disk I/O skyrockets, and everything descends into utter chaos. New, smaller composite indexes have been created, but the problem is running the ALTER TABLE without breaking anything.
The facts are as follows: there is an id column, but it is not unique, and the table has no primary key. Here is the SHOW CREATE TABLE (I didn't include all partitions):
CREATE TABLE `my_wonky_table` (
`id` bigint(20) unsigned NOT NULL,
`login` varchar(127) DEFAULT NULL,
`timestamp` int(10) unsigned NOT NULL,
`ip` varchar(32) CHARACTER SET ascii DEFAULT NULL,
`val_1` int(10) unsigned DEFAULT NULL,
`val_2` varchar(127) DEFAULT NULL,
`val_3` varchar(255) DEFAULT NULL,
`val_4` varchar(127) DEFAULT NULL,
`val_5` int(10) unsigned DEFAULT NULL,
KEY `my_wonky_table_id_idx` (`id`),
KEY `my_wonky_table_timestamp_idx` (`timestamp`),
KEY `my_wonky_table_val_1_idx` (`val_1`,`id`),
KEY `my_wonky_table_val_2_idx` (`val_2`,`id`),
KEY `my_wonky_table_val_4_idx` (`val_4`,`id`),
KEY `my_wonky_table_val_5_idx` (`val_5`,`id`),
KEY `my_wonky_table_ip_idx` (`ip`,`id`),
KEY `my_wonky_table_login_idx` (`login`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`id`)
(PARTITION pdefault VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
Regarding queries: it is always a SELECT on id, with everything else being used to filter.
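For example, a typical query looks something like the following; the values and the exact filter columns are made up for illustration:

-- Hypothetical example: id is always in the WHERE clause, other columns only narrow it down.
SELECT *
FROM my_wonky_table
WHERE id = 1234567
  AND val_1 = 42;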
What I would like to avoid:
I've thought of using the pt-online-schema-change tool to throttle, but ran into the no-primary-key wall. A different solution would be to do this in code, effectively moving the triggers into the code base, and slowly copying the data over in somewhat odd chunks (e.g. an hour's worth of data at a time, using the timestamp column), because there is no unique index.
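Roughly, each chunk copy would look something like this (the new table's name and the chunk bounds are placeholders):

-- Copy one hour's worth of rows; the application would advance the bounds and sleep between chunks.
INSERT INTO my_wonky_table_new
SELECT *
FROM my_wonky_table
WHERE `timestamp` >= 1600000000
  AND `timestamp` <  1600003600;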
Are there other solutions and/or tools available?
I present this as a separate Answer, since the innermost part is totally different.
As with my other answer, you need the new table with the new indexes, plus a script to copy all the data over. However, the meat of it is to simulate the trigger in your application.
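A minimal sketch of that setup step; the new table's name is a placeholder, and the replacement index shown is made up, since I don't know your actual new composite keys:

-- Clone the current definition, then swap in the new, smaller indexes while the table is still empty.
CREATE TABLE my_wonky_table_new LIKE my_wonky_table;

ALTER TABLE my_wonky_table_new
    DROP KEY my_wonky_table_val_1_idx,
    ADD KEY my_wonky_table_val_1_ts_idx (`val_1`, `timestamp`);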
Fortunately, you have id, even though it is not the PRIMARY KEY. And even though it is not UNIQUE, it can be used (assuming you don't have thousands of rows with the same id -- if you do, we can talk further).
The "copy script" and the application talk to each other.
The copy script is in a long loop that does the following:

1. SELECT GET_LOCK('copy', 5), high_water_mark FROM tbl; (or some other timeout)
2. Copy the rows with id BETWEEN high_water_mark AND high_water_mark + 999 into the new table.
3. UPDATE tbl SET high_water_mark = high_water_mark + 1000;
4. SELECT RELEASE_LOCK('copy');
5. Sleep a bit, then repeat until there are no more ids left to copy.
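Put together, one pass of the loop might look something like this (my_wonky_table_new and the tbl layout above are placeholders):

-- One iteration of the copy loop; check that GET_LOCK returned 1 before proceeding.
SELECT GET_LOCK('copy', 5), high_water_mark INTO @got_lock, @hwm FROM tbl;

INSERT INTO my_wonky_table_new
SELECT *
FROM my_wonky_table
WHERE id BETWEEN @hwm AND @hwm + 999;

UPDATE tbl SET high_water_mark = high_water_mark + 1000;

SELECT RELEASE_LOCK('copy');
-- ...sleep, then repeat until @hwm is past the largest id.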
The application, when reading, continues to read from the old table. But when writing, it does:

1. SELECT GET_LOCK('copy', 5), high_water_mark FROM tbl; (or some other timeout)
2. If id <= high_water_mark, write to the new table also.
3. SELECT RELEASE_LOCK('copy');
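For one incoming row, that write path might look something like this (the values, the column list, and my_wonky_table_new are made up for illustration):

SELECT GET_LOCK('copy', 5), high_water_mark INTO @got_lock, @hwm FROM tbl;

-- Always write the old table.
INSERT INTO my_wonky_table (id, login, `timestamp`, ip)
VALUES (1234567, 'someuser', UNIX_TIMESTAMP(), '203.0.113.7');

-- Only write the new table if the copy has already passed this id;
-- otherwise the copy loop will pick the row up later.
INSERT INTO my_wonky_table_new (id, login, `timestamp`, ip)
SELECT 1234567, 'someuser', UNIX_TIMESTAMP(), '203.0.113.7'
FROM DUAL
WHERE 1234567 <= @hwm;

SELECT RELEASE_LOCK('copy');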
Monitor the progress. At some point, you will need to stop everything, copy the last few rows, and do the RENAME TABLE.
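The swap itself can be a single atomic statement (the _old/_new names are placeholders):

RENAME TABLE my_wonky_table     TO my_wonky_table_old,
             my_wonky_table_new TO my_wonky_table;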
I don't know your optimal values for timeouts, sleep, or chunk size. But I don't think it is wise for chunk size to be bigger than 1K.
This technique has advantages for a variety of changes you might need to do in the future, so keep the guts in place.
I have somewhat of a crappy answer, but it's less invasive than the other options.

This is going to come down to what MySQL variant and version you are using, but if it's one thread per connection (my.cnf thread_handling=one-thread-per-connection, which might be the default on your build) and you can put your ALTER TABLE workload in a new connection, then that workload maps to a unique PID (thread), and you can use ionice/renice on it.
If you look at ps -eLf | grep mysql you can see the threads (lightweight processes), and you just need to figure out which PID belongs to your specific connection. If you connect via TCP, you could match your local connection port and map it against lsof to find the specific thread. Other ways are possible with strace, systemtap and more, or by running an initial query you can watch for.
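If the Performance Schema is enabled and you're on MySQL 5.7 or later, where the THREAD_OS_ID column exists (both of which are assumptions here), you can also ask for the OS thread directly from the connection that will run the ALTER TABLE:

-- Run this from the same connection that will issue the ALTER TABLE;
-- THREAD_OS_ID is the thread id you would hand to renice/ionice.
SELECT THREAD_OS_ID
FROM performance_schema.threads
WHERE PROCESSLIST_ID = CONNECTION_ID();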
After that, you can use ionice/renice to affect that PID on the system. You will really want to make sure you record which PID it was, and reset the nice and I/O priority levels afterwards, so as not to affect anything else.
As with the others, you really need to reshape this table in the long term. Partitions are useful, but not the endgame: you're running 1.3TiB of online data, and you state that you only need to read from the most recent 3-6 partitions. Having used MySQL since before native partitions were added, I think this would be a good case for a VIEW over separate tables (atomically update the VIEW when you need to roll over). It would also let you trivially move older tables to offline storage.
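A rough sketch of that rollover idea; all of the table and view names here are made up, and note that a UNION ALL view is readable but not writable, so writes would go directly to the current table:

-- Readers query the view; it only unions the tables that need to stay online.
CREATE OR REPLACE VIEW my_wonky_table_recent AS
    SELECT * FROM my_wonky_table_2024_06
    UNION ALL
    SELECT * FROM my_wonky_table_2024_07;

-- At rollover, re-run CREATE OR REPLACE VIEW with the new set of tables;
-- the switch is atomic from the readers' point of view.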