Java benchmarking - why is the second loop faster?

Question

I'm curious about this.

I wanted to check which function was faster, so I wrote a small piece of code and ran it many times.

    public static void main(String[] args) {

        long ts;
        String c = "sgfrt34tdfg34";

        ts = System.currentTimeMillis();
        for (int k = 0; k < 10000000; k++) {
            c.getBytes();
        }
        System.out.println("t1->" + (System.currentTimeMillis() - ts));

        ts = System.currentTimeMillis();
        for (int i = 0; i < 10000000; i++) {
            Bytes.toBytes(c);
        }
        System.out.println("t2->" + (System.currentTimeMillis() - ts));

    }

The "second" loop was faster, so I thought that the Bytes class from hadoop was faster than the method from the String class. Then I swapped the order of the loops, and c.getBytes() became the faster one. I ran it many times, and my conclusion is that, for reasons I don't understand, something happens in my VM after the first loop executes that makes the second loop run faster.

Tim B · Accepted Answer

This is a classic Java benchmarking issue. Hotspot/JIT/etc will compile your code as you use it, so it gets faster during the run.

Run the loop at least 3000 times (10000 on a server or on 64 bit) as a warm-up first, then do your measurements.
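
To make the warm-up advice concrete, here is a minimal sketch of how the question's benchmark might be restructured. It is only an illustration under some assumptions: the class name WarmupBenchmark, the helper method names, the sink field and the 20,000-iteration warm-up count are all made up for this example, and the local toBytes is a hypothetical stand-in for the hadoop Bytes.toBytes so the code runs without that dependency.

```java
import java.nio.charset.StandardCharsets;

class WarmupBenchmark {
    // Accumulates result lengths so the JIT cannot treat the calls as dead code.
    static long sink;

    // Hypothetical stand-in for Bytes.toBytes (assumption: it simply delegates to getBytes).
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the elapsed nanoseconds for `iterations` direct getBytes calls.
    static long timeGetBytes(String c, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += c.getBytes(StandardCharsets.UTF_8).length;
        }
        return System.nanoTime() - start;
    }

    // Returns the elapsed nanoseconds for `iterations` calls through the wrapper.
    static long timeToBytes(String c, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += toBytes(c).length;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String c = "sgfrt34tdfg34";

        // Warm-up: run both code paths before timing anything, so both have
        // had a chance to be compiled. The count is an assumed value chosen
        // to sit above the 10000 figure mentioned in the answer.
        timeGetBytes(c, 20_000);
        timeToBytes(c, 20_000);

        // Measurement: only these runs are reported.
        long t1 = timeGetBytes(c, 10_000_000);
        long t2 = timeToBytes(c, 10_000_000);
        System.out.println("t1->" + t1 / 1_000_000 + " ms");
        System.out.println("t2->" + t2 / 1_000_000 + " ms");
    }
}
```

With both paths warmed up before the first timestamp, the order of the two measured loops should matter much less. The numbers will still vary from run to run, so treat them as rough indications only.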

Sergey Kalinichenko · Answer

You know there's something wrong, because Bytes.toBytes calls c.getBytes internally:

    public static byte[] toBytes(String s) {
        try {
            return s.getBytes(HConstants.UTF8_ENCODING);
        } catch (UnsupportedEncodingException e) {
            LOG.error("UTF-8 not supported?", e);
            return null;
        }
    }

The source is taken from here. This tells you that the call cannot possibly be faster than the direct call - at the very best (i.e. if it gets inlined) it would have the same timing. Generally, though, you'd expect it to be a little slower, because of the small overhead in calling a function.

This is the classic problem with micro-benchmarking in interpreted, garbage-collected environments, where components such as the garbage collector run at arbitrary times. On top of that, hardware optimizations such as caching skew the picture. As a result, the best way to see what is going on is often to look at the source.
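
One way to reduce the effect of such noise, sketched here under assumptions, is to repeat the measurement for several rounds, alternate which candidate goes first, and keep the best time for each. The class name AlternatingRounds, the round and iteration counts and the local toBytes stand-in are all hypothetical. This is not a substitute for a proper harness such as JMH.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class AlternatingRounds {
    // Accumulates result lengths so the calls cannot be optimized away.
    static long sink;

    // Hypothetical stand-in for Bytes.toBytes (assumption: delegates to getBytes).
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the elapsed nanoseconds for `iterations` invocations of `task`.
    static long time(Supplier<byte[]> task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += task.get().length;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String c = "sgfrt34tdfg34";
        Supplier<byte[]> direct = () -> c.getBytes(StandardCharsets.UTF_8);
        Supplier<byte[]> wrapped = () -> toBytes(c);

        long bestDirect = Long.MAX_VALUE;
        long bestWrapped = Long.MAX_VALUE;

        // Alternate the order each round so neither candidate always runs
        // first; keeping the minimum discards rounds hit by GC pauses or
        // by compilation happening in the middle of the loop.
        for (int round = 0; round < 10; round++) {
            if (round % 2 == 0) {
                bestDirect = Math.min(bestDirect, time(direct, 1_000_000));
                bestWrapped = Math.min(bestWrapped, time(wrapped, 1_000_000));
            } else {
                bestWrapped = Math.min(bestWrapped, time(wrapped, 1_000_000));
                bestDirect = Math.min(bestDirect, time(direct, 1_000_000));
            }
        }

        System.out.println("best direct  (ns): " + bestDirect);
        System.out.println("best wrapped (ns): " + bestWrapped);
    }
}
```

If the two best times come out close to each other, that is consistent with the reading of the source above: the wrapper does the same work as the direct call, plus at most a small call overhead.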

Tags:

java

performance

benchmarking

Guille
