Why does Java disk I/O perform so much slower than the equivalent I/O code written in C?

I have an SSD that is specified to deliver at least 10k IOPS. My benchmark confirms that it can deliver 20k IOPS.

Then I wrote the following test:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.Random;

public class NioTest {
    private static final int sector = 4*1024;
    private static byte[] buf = new byte[sector];
    private static int duration = 10; // seconds to run
    // One slot per read; the run must finish fewer than 50000 reads or this overflows.
    private static long[] timings = new long[50000];

    public static final void main(String[] args) throws IOException {
        String filename = args[0];
        long size = Long.parseLong(args[1]); // device size in bytes
        RandomAccessFile raf = new RandomAccessFile(filename, "r");
        Random rnd = new Random();
        long start = System.currentTimeMillis();
        int ios = 0;
        while (System.currentTimeMillis()-start<duration*1000) {
            // Millisecond clock: any read faster than 1 ms is recorded as 0.
            long t1 = System.currentTimeMillis();
            // Pick a random 4 KiB block and seek to its aligned byte offset.
            long pos = (long)(rnd.nextDouble()*(size>>12));
            raf.seek(pos<<12);
            int count = raf.read(buf);
            timings[ios] = System.currentTimeMillis() - t1;
            ++ios;
        }
        System.out.println("Measured IOPS: " + ios/duration);
        long totalBytes = (long)ios*sector; // long avoids int overflow on fast devices
        double totalSeconds = (System.currentTimeMillis()-start)/1000.0;
        double speed = totalBytes/totalSeconds/1024/1024;
        System.out.println(totalBytes+" bytes transferred in "+totalSeconds+" secs ("+speed+" MiB/sec)");
        raf.close();
        // Unused slots stay 0 and sort to the front, so measured values occupy the tail.
        Arrays.sort(timings);
        int l = timings.length;
        System.out.println("The longest IO = " + timings[l-1]);
        System.out.println("Median duration = " + timings[l-(ios/2)]);
        System.out.println("75% duration = " + timings[l-(ios * 3 / 4)]);
        System.out.println("90% duration = " + timings[l-(ios * 9 / 10)]);
        System.out.println("95% duration = " + timings[l-(ios * 19 / 20)]);
        System.out.println("99% duration = " + timings[l-(ios * 99 / 100)]);
    }
}

And then I run this example and get just 2186 IOPS:

$ sudo java -cp ./classes NioTest /dev/disk0 240057409536
Measured IOPS: 2186
89550848 bytes transferred in 10.0 secs (8.540234375 MiB/sec)
The longest IO = 35
Median duration = 0
75% duration = 0
90% duration = 0
95% duration = 0
99% duration = 0

Why is it so much slower than the same test in C?

Update: here is Python code which gives 20k IOPS:

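import random
import time

def iops(dev, blocksize=4096, t=10):
    # mediasize(dev) is a helper that is not shown here; it returns the device size in bytes.
    fh = open(dev, 'rb')  # binary mode, so raw device bytes are not decoded as text
    count = 0
    start = time.time()
    while time.time() < start+t:
        count += 1
        pos = random.randint(0, mediasize(dev) - blocksize) # need at least one block left
        pos &= ~(blocksize-1)   # sector alignment at blocksize
        fh.seek(pos)
        blockdata = fh.read(blocksize)
    end = time.time()
    t = end - start
    fh.close()
    # The snippet as posted stops here; returning the rate is an assumed completion.
    return count / t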

Update 2: NIO code (only the changed lines; the rest of the method is the same)

...
RandomAccessFile raf = new RandomAccessFile(filename, "r");
InputStream in = Channels.newInputStream(raf.getChannel());
...
int count = in.read(buf);
...
Antonio asked Dec 10 '22 22:12


2 Answers

Your question is based on the false assumption that C code analogous to your Java code would perform as well as IOMeter does. Because this assumption is false, there is no discrepancy between C performance and Java performance to explain.

If your question is why your Java code performs so badly relative to IOMeter, the answer is that IOMeter doesn't issue requests one at a time like your code does. To get the full performance from your SSD, you need to keep its request queue non-empty, and waiting for each read to finish before issuing the next can't possibly do that.

Try using a pool of threads to issue your requests.
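As a rough illustration of that suggestion, here is a minimal sketch, not a tuned benchmark. The class name `PooledReadTest`, the `run` method, and the thread count of 32 are all made up for this example. It relies on the positional `FileChannel.read(ByteBuffer, long)`, which does not move the channel's shared position, so several workers can have reads outstanding at once.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical example class; not part of the original question or answer.
class PooledReadTest {
    static final int BLOCK = 4 * 1024;

    /** Runs `threads` workers doing random aligned reads for `seconds`; returns the total reads completed. */
    static long run(Path path, long size, int threads, int seconds) throws Exception {
        long blocks = size / BLOCK;
        if (blocks <= 0) throw new IllegalArgumentException("size must be at least one block");
        long deadline = System.nanoTime() + seconds * 1_000_000_000L;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(() -> {
                    ByteBuffer buf = ByteBuffer.allocateDirect(BLOCK); // one buffer per worker
                    long ios = 0;
                    while (System.nanoTime() < deadline) {
                        long pos = ThreadLocalRandom.current().nextLong(blocks) * BLOCK;
                        buf.clear();
                        ch.read(buf, pos); // positional read, safe to call from many threads
                        ios++;
                    }
                    return ios;
                }));
            }
            long total = 0;
            for (Future<Long> f : results) total += f.get(); // wait for every worker
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int seconds = 10;
        int threads = 32; // arbitrary; try several values to find where throughput levels off
        long total = run(Paths.get(args[0]), Long.parseLong(args[1]), threads, seconds);
        System.out.println("Measured IOPS: " + total / seconds);
    }
}
```

It takes the same arguments as the test in the question (path, then size in bytes). On a regular file the reads may be served from the OS page cache, so the figures only mean something against the raw device.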

David Schwartz answered Jan 18 '23 16:01


According to this (admittedly dated) article, legacy Java random access is 2.5 to 3.5 times slower than C/C++. It's a research PDF, so don't blame me if you click it.

Link: http://pages.cs.wisc.edu/~guo/projects/736.pdf

Java raw I/O is slower than C/C++, since system calls in Java are more expensive; buffering improves Java I/O performance, for it reduces system calls, yet there is no big gain for larger buffer size; direct buffering is better than the Java-provided buffered I/O classes, since the user can tailor it for his own needs; increasing the operation size helps I/O performance without overheads; and system calls are cheap in Java native methods, while the overhead of calling native methods is rather high. When the number of native calls is reduced properly, a performance comparable to C/C++ can be achieved.

Your code is from that era. Now let's rewrite it using java.nio rather than RandomAccessFile, shall we?

I have some nio2 code we can pit against C. Garbage collection can be ruled out :)
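The NIO.2 code mentioned above is not shown in this answer, so what follows is only a guess at what such a rewrite could look like. The class name `ChannelReadTest` and the `measure` method are made up for this example. The sketch reads through a `FileChannel` into a direct `ByteBuffer` and times each read with `System.nanoTime()` so that sub-millisecond reads do not all show up as 0.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical example class; not the answerer's actual code.
class ChannelReadTest {
    static final int BLOCK = 4 * 1024;

    /** Issues random aligned 4 KiB reads for `seconds`; returns the per-read latencies in ns, sorted ascending. */
    static long[] measure(Path path, long size, int seconds) throws IOException {
        long blocks = size / BLOCK;
        if (blocks <= 0) throw new IllegalArgumentException("size must be at least one block");
        long[] timings = new long[1 << 16];
        int ios = 0;
        // The direct buffer is allocated once, outside the loop, so the loop itself allocates nothing.
        ByteBuffer buf = ByteBuffer.allocateDirect(BLOCK);
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            long deadline = System.nanoTime() + seconds * 1_000_000_000L;
            while (System.nanoTime() < deadline) {
                long pos = ThreadLocalRandom.current().nextLong(blocks) * BLOCK;
                buf.clear();
                long t1 = System.nanoTime();
                ch.read(buf, pos); // positional read, no separate seek call
                long elapsed = System.nanoTime() - t1;
                if (ios == timings.length) timings = Arrays.copyOf(timings, ios * 2); // grow as needed
                timings[ios++] = elapsed;
            }
        }
        long[] result = Arrays.copyOf(timings, ios); // keep only the measured entries
        Arrays.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        int seconds = 10;
        long[] t = measure(Paths.get(args[0]), Long.parseLong(args[1]), seconds);
        System.out.println("Measured IOPS: " + t.length / seconds);
        System.out.println("Median duration (us) = " + t[t.length / 2] / 1000);
        System.out.println("99% duration (us) = " + t[(int) (t.length * 99L / 100)] / 1000);
    }
}
```

This version still issues one read at a time. It changes the API and the clock, but not the queue depth, so it cannot show what concurrent requests would achieve.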

Drew answered Jan 18 '23 15:01