
What AWS disk options should I use for my EC2 instance?

I created a new Ubuntu c3.xlarge instance, and when I get to the storage options I can change ROOT to General Purpose SSD, Provisioned IOPS, or Magnetic; if I pick Provisioned IOPS I can also set another value. Additional data storage under Instance Store 0 has no options, but if I change it to EBS then I get the same options.

I'm really struggling to understand:

  1. The speed of each option
  2. The cost of each option

The Amazon documentation is very unclear.

I'm using this instance to transfer data from text files into a Postgres relational database. The files have to be processed line by line, with a number of INSERT statements per line, so the load is slow even on my local computer (5 million rows of data take 15 hours). Originally the database was hosted separately on RDS, but that was incredibly slow, so I installed the database on the instance itself to remove network latency. That has sped things up a bit, but it is still considerably slower than my humble local Linux server.

Looking at the instance logs whilst loading the data, CPU usage is only at 6%, so I am now thinking that disk may be the limiting factor. The database uses the / disk (I'm not sure whether it is SSD or magnetic; how can I find out?) and the data files are on the /mnt disk (using Instance Store 0).

I only need this instance to do two things:

  1. Load database from datafiles
  2. Create Lucene Search Index from database

(so the database is just an interim step)

The search index is transferred to an EBean Server, and then I don't need this instance for another month, when I repeat the process with new data. With that in mind, I can afford to spend more money for faster processing because I'm only going to use it one day a month. Can I then stop the instance and incur no further costs?

What can I do to determine the problem and speed things up?

Paul Taylor asked Jun 26 '14 09:06



2 Answers

Here is my personal guideline:

  • If the volume is small (<33 GB) and only requires an occasional burst in performance, such as a boot volume, use magnetic drives.

  • If you need predictable performance and high throughput, use PIOPS volumes and EBS-optimized instances.

  • Otherwise, use General Purpose SSD.

Julio Faerman answered Oct 06 '22 03:10



Your CPU is only at 6%, so maybe you could try loading the data with multiple processes?
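As a rough illustration of the multi-process idea, here is a minimal Python sketch that splits the input lines into chunks and hands them to a pool of worker processes. The names `chunked`, `load_chunk`, and `parallel_load` are hypothetical, and `load_chunk` is only a placeholder: a real worker would open its own database connection and run the INSERT statements for its chunk. Whether this helps depends on whether the disk, rather than the CPU, is the real bottleneck.

```python
from itertools import islice
from multiprocessing import Pool


def chunked(lines, size):
    """Yield successive lists of at most `size` lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_chunk(chunk):
    """Hypothetical worker: parse each line and insert it into the database.

    A real worker would open its own database connection here and commit
    once per chunk; this placeholder only counts the lines it was given.
    """
    return len(chunk)


def parallel_load(lines, workers=4, chunk_size=1000):
    """Distribute chunks of lines across worker processes; return lines handled."""
    with Pool(processes=workers) as pool:
        return sum(pool.imap_unordered(load_chunk, chunked(lines, chunk_size)))


if __name__ == "__main__":
    # Stand-in for iterating over a real data file.
    sample = (f"row {i}" for i in range(10_000))
    print(parallel_load(sample))
```

Committing once per chunk instead of once per line is a separate change that usually reduces the number of synchronous disk writes, so it may be worth trying even with a single process.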

Have you tested the I/O performance of the volume attached to your remote instance?
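If no benchmarking tool is installed, a very rough check can be sketched in Python: write small blocks to a temporary file on the volume in question and force each one to disk with `fsync`, then report synced writes per second. The function name `measure_sync_write_iops` and its defaults are assumptions made for this illustration. It measures sequential synced writes only, so treat the number as a coarse comparison between volumes rather than a true IOPS figure.

```python
import os
import tempfile
import time


def measure_sync_write_iops(directory, writes=200, block_size=4096):
    """Write `writes` blocks, fsync after each one, and return writes per second."""
    block = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, block)
            os.fsync(fd)  # force the block to the device before continuing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    # Guard against a zero interval on very coarse clocks.
    return writes / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Replace the directory with one on the volume that holds the database.
    target = tempfile.gettempdir()
    print(f"{target}: {measure_sync_write_iops(target):.0f} synced writes/s")
```

Running it once against a directory on the root volume and once against one on the instance store gives a quick side-by-side comparison.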

PIOPS is expensive, but in my tests it did not perform significantly better than gp2; its only advantage is stability.

For example, I created a 500 GB gp2 volume and a 500 GB PIOPS volume with 1500 IOPS, then inserted and queried 1,000,000 documents with MongoDB while checking I/O performance with tools such as mongoperf, iostat, mongostat, and dstat.

Each volume was expected to deliver 1500 IOPS, but gp2 was unstable: it mostly ranged from 700 to 1600 IOPS for mixed reads and writes, could burst to 4000 for read-only workloads, and reached only about 800 for write-only workloads. PIOPS was perfectly stable at around 1470 IOPS.

For your situation, I suggest considering gp2, with the volume size depending on your IOPS demand: 500 GB gp2 = 1500 IOPS, 1 TB gp2 = 3000 IOPS (maximum).
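The two sizing figures above imply a rate of 3 IOPS per GB with a ceiling of 3000 IOPS. The small Python sketch below turns that into arithmetic. The rate is inferred from this answer's numbers and the ceiling is the maximum quoted here, so both are assumptions that should be checked against current AWS limits; the function names are hypothetical.

```python
import math

GP2_IOPS_PER_GB = 3   # assumption inferred from the answer: 500 GB -> 1500 IOPS
GP2_MAX_IOPS = 3000   # maximum quoted in the answer; may differ today


def gp2_baseline_iops(size_gb):
    """Estimated baseline IOPS for a gp2 volume of `size_gb` gigabytes."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS)


def gp2_size_for_iops(target_iops):
    """Smallest gp2 size in GB whose estimated baseline meets `target_iops`."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the assumed gp2 maximum")
    return math.ceil(target_iops / GP2_IOPS_PER_GB)


if __name__ == "__main__":
    for size in (100, 500, 1000):
        print(f"{size} GB -> {gp2_baseline_iops(size)} IOPS")
```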

Laisky answered Oct 06 '22 03:10
