Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL on EC2/EBS. Too slow?

I have a MySQL m2.2xlarge instance on AWS. The MySQL data dir resides in the root EBS /. It is a single EBS not RAID. We have three main tables. One of them Table C, the largest in content, is used only the last days worth of data. The Insert rate in these tables is around 80.000 rows A DAY. The 3 tables have around 42 million rows. The innodb_buffer_pool_size has ~30GB of the instance RAM.

Table A is the most important, its data length is ~33GB and index ~11GB Table B has data length is ~8GB and index ~5GB

In our website, the two main queries (latency-wise) are like this:

SELECT * FROM TableA WHERE id in (.....)

SELECT * FROM TableB JOIN .... WHERE id in (.....)

In most pages the (...) will be some ~50 recent ids with these queries taking < 50 ms each. But in some other pages we hit older ids and the latency for these queries skyrocket to 500ms, 800ms, up to 1.5 secs.

I've done a test where, after a Mysql restart, I did a SELECT id FROM TableB to force index into cache/memory. The Table B query would still be slow. Then I did a SELECT * FROM TableB. And now with the whole table in cache/memory the queries become really fast (<50ms).

My question: > 500 ms, > 1000ms is a reasonable latency for a query that just retrieves rows by PRIMARY KEY? Even in a 42M table? Even when all rows are in disk? It seems too much for me.

Would moving MySQL data to ephemeral storage (/mnt) improve this? Would using Provisioned IOPS help with?

like image 807
Felipe Hummel Avatar asked Feb 12 '13 18:02

Felipe Hummel


People also ask

Why is EBS so slow?

The creation of EBS-backed AMIs or EBS snapshots might be slow because of the large amount of data that must be copied to Amazon Simple Storage Service (Amazon S3). Many factors, such as Write activity on the EBS volume, can impact the creation time.

Why is AWS EC2 so slow?

Short description. There are many possible causes of this problem, including problems with an external service that your instance relies on, disk thrashing, or network connectivity issues. This article discusses one common cause: depleted I/O burst credits on the gp2 root volume.

Is instance storage faster than EBS?

Conclusions: Instance-store is over 5x faster than EBS-SSD for uncached reads. Instance-store and EBS-SSD are equalivent for cached reads. Instance-store is over 10x faster than EBS-SSD for writes.


1 Answers

Disclaimer: I'm no expert on (My)SQL performance at all, just commenting on the AWS aspects of your use case.

With that out of the way, there are several questions to address, first and foremost:

Would moving MySQL data to ephemeral storage (/mnt) improve this?

I've provided an answer for the identical question Will moving data from EBS to ephemeral storage improve MySQL query performance?, please check it out for some important details - TL;DR: You most definitely don't want to do that if you have any durability needs (except if you exactly know what you are doing), and performance gains via ephemeral storage claimed in the past are also dubious at best, if not plain wrong from today's perspective.

Would using Provisioned IOPS help with?

Absolutely, Provisioned IOPS Volumes are specifically designed to meet the needs of I/O-intensive workloads, particularly database workloads, that are sensitive to storage performance and consistency in random access I/O throughput, see the post Fast Forward - Provisioned IOPS for EBS Volumes for a general introduction.

  • Please note that these ideally (but not necessarily) go hand in hand with EBS-Optimized Instances, which use an optimized configuration stack and provides additional, dedicated capacity for EBS I/O. This optimization provides the best performance for your EBS volumes by minimizing contention between EBS I/O and other traffic from your Amazon EC2 instance.

  • Specifically you'll want to read through the dedicated section Increasing EBS Performance, which addresses how to look at the I/O performance you require and your options for increasing EBS performance to meet those requirements with RAID and/or Provisioned IOPS depending on your use case.

My question: > 500 ms, > 1000ms is a reasonable latency for a query that just retrieves rows by PRIMARY KEY? Even in a 42M table? Even when all rows are in disk? It seems too much for me.

As mentioned I can't judge the values as such, however, given your specification you seem to have memory contention, insofar the m2.2xlarge instance features 'only' 34.2 GiB of memory and you are allocating ~30GB for the innodb_buffer_pool_size already - this seem to be a bit high to me given other memory requirements of the OS and/or MySQL, so there might already be swapping involved, which would perfectly explain the cache/memory warming behavior you are experiencing.

  • As a general recommendation for database workloads it seems to be the biggest bang for the buck by far these days to simply ensure your dataset fits entirely into RAM, which is easier than ever with the plethora of instance types (if at all feasible in the first place).

Finally I recommend to read the very recent post about Improving PostgreSQL performance on AWS EC2 - the recommendations there primarily address the AWS side of things as well and do apply to MySQL too accordingly; section Durable databases pretty much summarizes my suggestions above:

For a durable database where you care about your data, what you want instead of a high I/O instance is an EBS Optimized instance, which has guaranteed network bandwidth to the EBS storage servers. Use EBS volumes with provisioned IOPs and, for best results, stripe a group of EBS volumes into a RAID10 array. See increasing EBS performance.

like image 126
Steffen Opel Avatar answered Sep 23 '22 00:09

Steffen Opel