Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

System.arrayCopy is slow

I've been trying to measure the performance of the System.arrayCopy vs Arrays.copyOf in order to choose properly one of them. Just for the sake of benchmark I added manual copy as well and the result surprised me. Obviously I'm missing something really important, could you, please, tell me, what it is? The implementation is as follows (see first 4 methods).

public class ArrayCopy {

    public static int[] createArray( int size ) {
        int[] array = new int[size];
        Random r = new Random();
        for ( int i = 0; i < size; i++ ) {
            array[i] = r.nextInt();
        }
        return array;
    }

    public static int[] copyByArraysCopyOf( int[] array, int size ) {
        return Arrays.copyOf( array, array.length + size );
    }

    public static int[] copyByEnlarge( int[] array, int size ) {
        return enlarge( array, size );
    }

    public static int[] copyManually( int[] array, int size ) {
        int[] newArray = new int[array.length + size];
        for ( int i = 0; i < array.length; i++ ) {
            newArray[i] = array[i];
        }
        return newArray;
    }

    private static void copyArray( int[] source, int[] target ) {
        System.arraycopy( source, 0, target, 0, Math.min( source.length, target.length ) );
    }

    private static int[] enlarge( int[] orig, int size ) {
        int[] newArray = new int[orig.length + size];
        copyArray( orig, newArray );
        return newArray;
    }

    public static void main( String... args ) {
        int[] array = createArray( 1000000 );
        int runs = 1000;
        int size = 1000000;
        System.out.println( "****************** warm up #1 ******************" );
        warmup( ArrayCopy::copyByArraysCopyOf, array, size, runs );
        warmup( ArrayCopy::copyByEnlarge, array, size, runs );
        warmup( ArrayCopy::copyManually, array, size, runs );
        System.out.println( "****************** warm up #2 ******************" );
        warmup( ArrayCopy::copyByArraysCopyOf, array, size, runs );
        warmup( ArrayCopy::copyByEnlarge, array, size, runs );
        warmup( ArrayCopy::copyManually, array, size, runs );
        System.out.println( "********************* test *********************" );
        System.out.print( "copyByArrayCopyOf" );
        runTest( ArrayCopy::copyByArraysCopyOf, array, size, runs );
        System.out.print( "copyByEnlarge" );
        runTest( ArrayCopy::copyByEnlarge, array, size, runs );
        System.out.print( "copyManually" );
        runTest( ArrayCopy::copyManually, array, size, runs );
    }

    private static void warmup( BiConsumer<int[], Integer> consumer, int[] array, int size, int runs ) {
        for ( int i = 0; i < runs; i++ ) {
            consumer.accept( array, size );
        }
    }

    private static void runTest( BiConsumer<int[], Integer> consumer, int[] array, int size, int runs ) {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        long currentCpuTime = threadMXBean.getCurrentThreadCpuTime();
        long nanoTime = System.nanoTime();
        for ( int i = 0; i < runs; i++ ) {
            consumer.accept( array, size );
        }
        System.out.println( "-time = " + ( ( System.nanoTime() - nanoTime ) / 10E6 ) + " ms. CPU time = " + ( ( threadMXBean.getCurrentThreadCpuTime() - currentCpuTime ) / 10E6 ) + " ms" );
    }
}

The result shows that manual copy performed around 30% better, as shown below:

****************** warm up #1 ******************
****************** warm up #2 ******************
********************* test *********************
copyByArrayCopyOf-time = 162.470107 ms. CPU time = 153.125 ms
copyByEnlarge-time = 168.6757949 ms. CPU time = 164.0625 ms
copyManually-time = 116.3975962 ms. CPU time = 110.9375 ms

I'm really confused, because I thought (and probably I still do) that System.arrayCopy due to its nativity is the best possible way to copy an array, but I cannot explain this result.

like image 995
Michael Bláha Avatar asked Oct 27 '16 13:10

Michael Bláha


People also ask

How does arraycopy work in Java 2?

2. Performance of System.arraycopy () System.arraycopy () copies the array contents from the source array, beginning at the specified position, to the designated position in the destination array. Additionally, before copying, the JVM checks that both source and destination types are the same.

What is the best way to copy an array by length?

For example, in ArrayList, System.arraycopy () is always used, never "element by element copy", regardless of length (even if it's 0). Since ArrayList is very performance conscious, we can derive that System.arraycopy () is the most effecient way of array copying regardless of length.

What is the difference between manual and manual copy of arraycopy?

When copySize = 24, System.arraycopy () and the manual loop take almost the same time (sometimes one is very slightly faster than the other, other times it’s the contrary), When copySize < 24, the manual loop is faster than System.arraycopy () (slightly faster with copySize = 23, really faster with copySize < 5),

What is the worst-case scenario for copying a large array of memory?

Moreover, their complexities can differ between platforms. We can be sure that the worst-case scenario is O (N). However, the processor can copy contiguous blocks of memory one block at a time ( memcpy () in C), so actual results can be better. 3. Performance of Arrays.copyOf ()


2 Answers

Actually, HotSpot compiler is smart enough to unroll and vectorize manual copy loop - that's why the result code appears to be well optimized.

Why is System.arraycopy slower then? It is originally a native method, and you have to pay for a native call until the compiler optimizes it as JVM intrinsic.

However, in your test the compiler does not have a chance for such optimization, because enlarge method is not called many enough times (i.e. it is not considered as hot).

I'll show you a funny trick to force the optimization. Rewrite enlarge method as follows:

private static int[] enlarge(int[] array, int size) {
    for (int i = 0; i < 10000; i++) { /* fool the JIT */ }

    int[] newArray = new int[array.length + size];
    System.arraycopy(array, 0, newArray, 0, array.length);
    return newArray;
}

An empty loop triggers a backedge counter overflow, which in turn triggers the compilation of enlarge method. The empty loop is then eliminated from the compiled code, so it is harmless. Now enlarge method gets about 1.5x faster than the manual loop!

It is important that System.arraycopy immediately follows new int[]. In this case HotSpot can optimize away the redundant zeroing of newly allocated array. You know, all Java objects must be zeroed right after creation. But as far as compiler detects that the array is filled right after creation, it may eliminate zeroing, thus making the result code yet faster.

P.S. @assylias' benchmark is good, but it also suffers from the fact that System.arraycopy is not intrinsified for the large arrays. In case of small arrays arrayCopy benchmark is called many times per second, JIT considers it hot and optimizes well. But for large arrays each iteration is longer, so there is a lot less iterations per second, and JIT does not treat arrayCopy as hot.

like image 89
apangin Avatar answered Sep 21 '22 06:09

apangin


Using jmh, I get the results shown in the table below (size is the size of the array, score is the time in microseconds, error shows the confidence interval at 99.9%):

Benchmark              (size)  Mode  Cnt      Score     Error  Units
ArrayCopy.arrayCopy        10  avgt   60      0.022 ±   0.001  us/op
ArrayCopy.arrayCopy     10000  avgt   60      4.959 ±   0.068  us/op
ArrayCopy.arrayCopy  10000000  avgt   60  11906.870 ± 220.850  us/op
ArrayCopy.clone_           10  avgt   60      0.022 ±   0.001  us/op
ArrayCopy.clone_        10000  avgt   60      4.956 ±   0.068  us/op
ArrayCopy.clone_     10000000  avgt   60  10895.856 ± 208.369  us/op
ArrayCopy.copyOf           10  avgt   60      0.022 ±   0.001  us/op
ArrayCopy.copyOf        10000  avgt   60      4.958 ±   0.072  us/op
ArrayCopy.copyOf     10000000  avgt   60  11837.139 ± 220.452  us/op
ArrayCopy.loop             10  avgt   60      0.036 ±   0.001  us/op
ArrayCopy.loop          10000  avgt   60      5.872 ±   0.095  us/op
ArrayCopy.loop       10000000  avgt   60  11315.482 ± 217.348  us/op

In substance, loop seems to perform slightly better than arrayCopy for large arrays indeed - probably because the JIT is quite good at optimising such a simple loop. For smaller arrays, arrayCopy seems better (although the difference is quite small).

Note however that clone seems to be consistently as good as, or better than, the other options, depending on size. So I would go for clone, which also happens to be easier to use.


For reference, benchmark code, run with -wi 5 -w 1000ms -i 30 -r 1000ms -t 1 -f 2 -tu us:

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
public class ArrayCopy {

  @Param({"10", "10000", "10000000"}) int size;

  private int[] array;

  @Setup(Level.Invocation) public void setup() {
    array = new int[size];
    for (int i = 0; i < size; i++) {
      array[i] = i;
    }
  }

  @Benchmark
  public int[] clone_() {
    int[] copy = array.clone();
    return copy;
  }

  @Benchmark
  public int[] arrayCopy() {
    int[] copy = new int[array.length];
    System.arraycopy(array, 0, copy, 0, array.length);
    return copy;
  }

  @Benchmark
  public int[] copyOf() {
    int[] copy = Arrays.copyOf(array, array.length);
    return copy;
  }

  @Benchmark
  public int[] loop() {
    int[] copy = new int[array.length];
    for (int i = 0; i < array.length; i++) {
      copy[i] = array[i];
    }
    return copy;
  }
}
like image 24
assylias Avatar answered Sep 20 '22 06:09

assylias