 

Java VM memory performance - Are Array writes faster than Array reads?

I performed a short benchmark on a long array in Java with quite strange results. It seems that sequential reads with random writes are faster, taking half the time, than random reads with sequential writes. Does anyone have a clue why?

Here are two methods operating on an array of longs (run with -Xmx2G or so): one reads sequentially and writes to random positions, the other reads from random positions and writes sequentially:

import java.util.Random;


public class Scratch {
static Random random = new Random();
static long[] arr = new long[100000000];

static void seqReadRandWrite() {
    // read sequentially (arr[i]), write to a random position
    for(int i=0;i<arr.length;i++) {
        int at = random.nextInt(arr.length);
        arr[at] = arr[i];
    }
}

static void seqWriteRandRead() {
    // read from a random position, write sequentially (arr[i])
    for(int i=0;i<arr.length;i++) {
        int at = random.nextInt(arr.length);
        arr[i] = arr[at];
    }
}

public static void main(String[] args) throws Exception {

    seqWriteRandRead(); // warm up

    long nanos = System.nanoTime();
    seqReadRandWrite();
    System.out.println("Time: " + (System.nanoTime()-nanos) + "ns");

    nanos = System.nanoTime();
    seqWriteRandRead();
    System.out.println("Time: " + (System.nanoTime()-nanos) + "ns");

}
}

The results on my notebook are:

Time: 2774662168ns

Time: 6059499068ns

Which means it's twice as fast to write to random positions as to read from them ... or? Is my notebook broken?

PS: This does not claim to be a rigorous benchmark, although most of the points in the linked advice about benchmarking are covered. Even if I run the 200,000,000 operations multiple times, the results stay quite constant. It seems (seems!) that moving memory from random positions into sequential blocks is slower than moving memory from sequential positions into random blocks, at least with memory of this size and the above way of doing it. And I wonder why?
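One small tweak worth considering for repeatability (a suggestion, not part of the original code): seeding the Random so every run draws the same index sequence, e.g.

static Random random = new Random(42L); // fixed seed (illustrative): identical index sequence on every run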

cybye, asked Jan 31 '13


3 Answers

Your benchmark is producing numbers that fail the "Do they make sense?" test. In such a situation, you should always double / triple / quadruple check your methodology ... BEFORE treating the numbers as a true reflection of reality.

Writing reliable benchmarks is hard. And in the case of Java, it is particularly hard, because some aspects of the Java platform can introduce systematic distortions into your benchmark measurements ... unless you specifically allow / compensate for them.

But the "check your methodology" rule applies to ALL experiments ... especially ones that produce results that don't seem to make sense. (Like neutrinos travelling faster than light ...)


The other thing to note is that once you've rewritten the benchmark to account for the confounding factors, you may still see unexpected numbers. The issue here is that the performance of benchmarks like this is likely to be sensitive to things like the sizes of the L1 and L2 caches, the cache line size, the relative speeds of the different levels of memory ... and their interplay with the exact sequences of instructions that the benchmark produces in the tight loops.

These things are complicated, difficult to analyse, and can give counterintuitive behaviour. And it is not surprising (to me) that different machines give different measured performance.

So even if the figures are real, it is still unsafe to draw any general conclusions about the speed of reads versus writes from this benchmark. Not even if you restrict them to just your laptop.
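For what it's worth, a harness like JMH takes care of warm-up, forking and dead-code elimination for you. Here is a minimal sketch, assuming the JMH library (org.openjdk.jmh) is on the classpath; the class and method names are illustrative, not from the original post:

import java.util.Random;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class ArrayAccessBench {
    long[] arr;
    Random random;

    @Setup
    public void setup() {
        arr = new long[100000000]; // ~800 MB of longs; run with -Xmx2G as in the question
        random = new Random(42);   // fixed seed for repeatable index sequences
    }

    @Benchmark
    public long randRead() {
        long x = 0;
        for (int i = 0; i < arr.length; i++) {
            x += arr[random.nextInt(arr.length)];
        }
        return x; // JMH consumes return values, which defeats dead-code elimination
    }

    @Benchmark
    public void randWrite() {
        for (int i = 0; i < arr.length; i++) {
            int at = random.nextInt(arr.length);
            arr[at] = at;
        }
    }
}

JMH then reports statistically processed timings per benchmark method, instead of a single hand-rolled System.nanoTime() delta.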

Stephen C, answered Oct 30 '22


I believe this benchmark is of little use to you. There are many measurement parameters you did not describe, and your approach to the problem is not described at all. To draw any conclusion about the speed of implementations with respect to VMs, computers, RAM speed, software running at the same time, the kinds of objects or plain values you copy, and so on, you need a methodical approach. As posed, this question is not answerable. You must narrow down the specific circumstances under which you want to measure speed.

In particular, you cannot draw conclusions when using random numbers; randomness makes it much harder to separate best-, worst- and average-case behaviour.

Please read up on algorithmic complexity, then look into how to make scientific runtime performance measurements. I hope this helps a bit.

The first answer to this question is excellent and will help you understand: How do I write a correct micro-benchmark in Java?

Best regards,

Semo, answered Oct 30 '22


In summary, the question title is slightly incorrect. The truth seems to be that in some environments (e.g. mine and the OP's) random array writes are faster than random array reads. But note that this is not true for everyone.

Based on @JustinKSU's comment I separated out reads and writes and found that random writes are faster than random reads. The results are as follows. This seems to be the reason, and the collective opinion here seems to be that read misses in the cache are more expensive than write misses (if writes involve any caching at all), plausibly because a read miss stalls until the data actually arrives, while a write can typically be buffered and completed asynchronously.

In production, though, where there is other activity, HotSpot might play a role.

/cygdrive/c/Java/jdk1.7.0/bin/javac.exe Scratch.java && /cygdrive/c/Java/jdk1.7.0/bin/java Scratch
Starting
seqRead: 1273719725ns
seqRead: 1243055271ns
seqRead: 1245022497ns
seqRead: 1242868527ns
seqRead: 1241655611ns
randRead: 6900959912ns
randRead: 6965196004ns
randRead: 7379623094ns
randRead: 7020390995ns
randRead: 6938997617ns
seqWrite: 1266963940ns
seqWrite: 1250599487ns
seqWrite: 1246471685ns
seqWrite: 1230472648ns
seqWrite: 1246975416ns
randWrite: 3898382192ns
randWrite: 3897441137ns
randWrite: 3939947844ns
randWrite: 4207906037ns
randWrite: 4103594207ns

Compilation finished at Thu Jan 31 14:38:57

My modified code is as follows:

import java.util.Random;


public class Scratch {
static Random random = new Random();
static long[] arr = new long[100000000];

static void seqReadRandWrite() {
    for(int i=0;i<arr.length;i++) {
        int at = Math.abs(random.nextInt() % arr.length);
        arr[at] = arr[i];
    }
}

static void seqWriteRandRead() {
    for(int i=0;i<arr.length;i++) {
        int at = Math.abs(random.nextInt() % arr.length);
        arr[i] = arr[at];
    }
}


static void seqRead() {
    int x = 0;
    for(int i=0;i<arr.length;i++) {
        // 'at' is unused here; the random call is kept so all four loops pay the same RNG cost
        int at = Math.abs(random.nextInt() % arr.length);
        x += arr[i]; // read sequentially
    }
}

static void randRead() {
    int x = 0;
    for(int i=0;i<arr.length;i++) {
        int at = Math.abs(random.nextInt() % arr.length);
        x += arr[at]; // read from a random position
    }
}

static void seqWrite() {
    for(int i=0;i<arr.length;i++) {
        int at = Math.abs(random.nextInt() % arr.length);
        arr[i] = at; // write sequentially
    }
}

static void randWrite() {
    for(int i=0;i<arr.length;i++) {
        int at = Math.abs(random.nextInt() % arr.length);
        arr[at] = at; // write to a random position
    }
}


public static void main(String[] args) throws Exception {

    // seqWriteRandRead(); // warm up
    System.out.println("Starting");

    long nanos =  -1;
    /*
    for (int i = 0; i < 5; i++) {       
        nanos = System.nanoTime();
        seqWriteRandRead();
        System.out.println("WriteRandRead Time: " + (System.nanoTime()-nanos) + "ns");

        nanos = System.nanoTime();
        seqReadRandWrite();
        System.out.println("ReadRandWrite Time: " + (System.nanoTime()-nanos) + "ns");
    }
    */

    for (int i = 0; i < 5; i++) {       
        nanos = System.nanoTime();
        seqRead();
        System.out.println("seqRead: " + (System.nanoTime()-nanos) + "ns");
    }

    for (int i = 0; i < 5; i++) {       
        nanos = System.nanoTime();
        randRead();
        System.out.println("randRead: " + (System.nanoTime()-nanos) + "ns");
    }


    for (int i = 0; i < 5; i++) {       
        nanos = System.nanoTime();
        seqWrite();
        System.out.println("seqWrite: " + (System.nanoTime()-nanos) + "ns");
    }

    for (int i = 0; i < 5; i++) {       
        nanos = System.nanoTime();
        randWrite();
        System.out.println("randWrite: " + (System.nanoTime()-nanos) + "ns");
    }

}
}
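One caveat about the read tests above: x is never used after the loops, so the JIT is in principle free to eliminate the reads entirely. A defensive variant (a sketch; the sink field is hypothetical, not part of the original answer) publishes the sum so the loop body stays observable:

static volatile long sink; // hypothetical guard: publishing here prevents dead-code elimination

static void randReadChecked() {
    long x = 0;
    for(int i=0;i<arr.length;i++) {
        x += arr[Math.abs(random.nextInt() % arr.length)];
    }
    sink = x; // observable side effect keeps the reads alive
}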

UPDATE

@tomcarchrae ran the same test on Linux, with significantly different results. Below, the first column contains the numbers from my test and the second contains Tom's:

seqRead:   1273719725ns   2810487542ns  
seqRead:   1243055271ns   2780504580ns  
seqRead:   1245022497ns   2746663894ns  
seqRead:   1242868527ns   2746094469ns  
seqRead:   1241655611ns   2763107970ns  
randRead:  6900959912ns   23093543703ns 
randRead:  6965196004ns   22458781637ns 
randRead:  7379623094ns   24421031646ns 
randRead:  7020390995ns   25880250599ns 
randRead:  6938997617ns   26873823898ns 
seqWrite:  1266963940ns   4226886722ns  
seqWrite:  1250599487ns   4537680602ns  
seqWrite:  1246471685ns   3880372295ns  
seqWrite:  1230472648ns   4160499114ns  
seqWrite:  1246975416ns   4008607447ns  
randWrite: 3898382192ns   25985349107ns 
randWrite: 3897441137ns   22259835568ns 
randWrite: 3939947844ns   22556465742ns 
randWrite: 4207906037ns   22143959163ns 
randWrite: 4103594207ns   21737397817ns 
Miserable Variable, answered Oct 30 '22