Lately we're seeing a problem with our RDS database: about every 15-20 minutes the database becomes unresponsive for a minute or two. While it's unresponsive, several GBs of free disk space are consumed and then freed again. It's hard to explain, so I've attached a screenshot of the monitoring graphs covering a one-hour period. Does anyone have any idea what's going on, or where I should start looking?
The disk space utilization makes me think you have very large query result sets being sorted in temporary tables on disk. To check, look for increases in the status counter Created_tmp_disk_tables when the spikes occur:
mysql> show global status like 'Created%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 56    |   <-- this is probably the culprit
| Created_tmp_files       | 23    |
| Created_tmp_tables      | 3177  |
+-------------------------+-------+
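One low-tech way to tie the jumps in that counter to the dips on your graph is to snapshot it right before and right after a dip; the counters are cumulative since the last server restart, so a large jump across a single dip points at on-disk temp tables. A minimal sketch:
mysql> SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';  -- baseline reading
-- ... wait for the next free-space dip on the monitoring graph ...
mysql> SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';  -- a big jump here means result sets are hitting disk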
If it is, you may have queries that incur temp tables large enough that they can't fit in memory, and have to spool to disk. Unfortunately, you can't find out exactly how large these temp result sets are, but judging by the graph I would guess on the order of 15 GiB.
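For context, an implicit temporary table stays in memory only while it is smaller than both tmp_table_size and max_heap_table_size; once it grows past that limit (or contains BLOB/TEXT columns, which force disk temp tables in these MySQL versions), it is converted to an on-disk table. You can check the current limits like this (on RDS you change them through a DB parameter group rather than SET GLOBAL):
mysql> SHOW GLOBAL VARIABLES WHERE Variable_name IN ('tmp_table_size', 'max_heap_table_size');
Raising these won't rescue a 15 GiB result set, but it tells you how far under the limit a query has to stay to avoid spilling to disk.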
You should figure out which queries are generating these huge temp tables and try to optimize them. Unfortunately, stock MySQL doesn't have good logging to track this down, and Amazon RDS doesn't allow you to replace stock MySQL with an enhanced fork such as Percona Server, which would report this information in its slow-query log.
So you'll have to go to your development environment, review your SQL queries, run them through EXPLAIN one by one, and identify which ones are producing the huge temp tables.
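As a rough illustration (the table and column names below are placeholders, not from your schema), "Using temporary" in EXPLAIN's Extra column, usually alongside "Using filesort", is the signature to hunt for:
mysql> EXPLAIN SELECT customer_id, COUNT(*) AS order_count
    ->   FROM orders
    ->  GROUP BY customer_id
    ->  ORDER BY order_count DESC;
Any query that shows "Using temporary; Using filesort" against a large table is a candidate for the multi-GiB spools you're seeing.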