I know that making judgements of Java microbenchmarks is extremely fraught, but I'm seeing something that seems odd, and I'd like to get some explanation for it.
Note that I'm not using the JMH framework for this. I'm aware of it, but I didn't want to go to that length for this.
I'll provide the entire code sample, but in short, when I test the performance of these two methods
private FooPrime[] testStreamToArray(ArrayList<Foo> fooList) {
return (FooPrime[]) fooList.stream().
map(it -> {
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).
toArray(FooPrime[]::new);
}
private FooPrime[] testStreamToArray2(ArrayList<Foo> fooList) {
return (FooPrime[]) fooList.stream().
map(it -> {
int stuff = it.getAlpha().length();
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).
toArray(FooPrime[]::new);
}
I find very surprising results. In the larger code sample, I'm measuring four different ways of doing this, and the first three are very close in performance. They all run about 50k ns per iteration. However, the second code sample consistently runs just under half of that total. That's right. It's not slower, it's quite a bit faster.
The last run shows numbers like this:
manualcopy:54575 ns
toarray:53617 ns
streamtoarray:52990 ns
streamtoarray2:24217 ns
Each run has numbers similar to these.
I'll now provide the entire class and base class. Note that I do have a "warm-up" pass, where I execute the methods under test a few thousand times before starting the timings. Also note that although this runs "testStreamToArray2" last, I also tried moving that block to the first test, and the numbers come out about the same. The commented out lines are there to convince me that the methods are actually doing something (the timings are still about the same with those lines not commented out).
package timings;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class ListToArrayOfPrimesTiming {
public static void main(String[] args) {
ListToArrayOfPrimesTiming tests = new ListToArrayOfPrimesTiming(args);
tests.go();
}
public ListToArrayOfPrimesTiming(String[] args) { }
private void go() {
final ArrayList<Foo> fooList = new ArrayList<>();
for (int ctr = 0; ctr < 1000; ++ ctr) {
fooList.add(new Foo().alpha("a" + ctr).beta("b" + ctr));
}
for (int ctr = 0; ctr < 20000; ++ ctr) {
testManualCopy(fooList);
testToArray(fooList);
testStreamToArray(fooList);
testStreamToArray2(fooList);
}
int iters = 100000;
// Set<Integer> lengths = new HashSet<>();
// Set<FooPrime> distinctFooPrimes = new HashSet<>();
// lengths.clear();
// distinctFooPrimes.clear();
new TimingContainer(iters, "manualcopy", new TimingTest() {
@Override
public void run() {
FooPrime[] fooPrimeArray = testManualCopy(fooList);
// lengths.add(fooPrimeArray.length);
// distinctFooPrimes.add(fooPrimeArray[0]);
}
}).run();
// System.out.println("lengths[" + lengths + "]");
// lengths.clear();
// System.out.println("distinctFooPrimes[" + distinctFooPrimes + "]");
// distinctFooPrimes.clear();
new TimingContainer(iters, "toarray", new TimingTest() {
@Override
public void run() {
FooPrime[] fooPrimeArray = testManualCopy(fooList);
// lengths.add(fooPrimeArray.length);
// distinctFooPrimes.add(fooPrimeArray[0]);
}
}).run();
// System.out.println("lengths[" + lengths + "]");
// lengths.clear();
// System.out.println("distinctFooPrimes[" + distinctFooPrimes + "]");
// distinctFooPrimes.clear();
new TimingContainer(iters, "streamtoarray", new TimingTest() {
@Override
public void run() {
FooPrime[] fooPrimeArray = testStreamToArray(fooList);
// lengths.add(fooPrimeArray.length);
// distinctFooPrimes.add(fooPrimeArray[0]);
}
}).run();
// System.out.println("lengths[" + lengths + "]");
// lengths.clear();
// System.out.println("distinctFooPrimes[" + distinctFooPrimes + "]");
// distinctFooPrimes.clear();
new TimingContainer(iters, "streamtoarray2", new TimingTest() {
@Override
public void run() {
FooPrime[] fooPrimeArray = testStreamToArray2(fooList);
// lengths.add(fooPrimeArray.length);
// distinctFooPrimes.add(fooPrimeArray[0]);
}
}).run();
// System.out.println("lengths[" + lengths + "]");
// lengths.clear();
// System.out.println("distinctFooPrimes[" + distinctFooPrimes + "]");
// distinctFooPrimes.clear();
}
private FooPrime[] testManualCopy(ArrayList<Foo> fooList) {
FooPrime[] fooPrimeArray = new FooPrime[fooList.size()];
int index = -1;
for (Foo foo: fooList) {
++ index;
fooPrimeArray[index] = new FooPrime().gamma(foo.getAlpha() + foo.getBeta());
}
return fooPrimeArray;
}
private FooPrime[] testToArray(ArrayList<Foo> fooList) {
List<FooPrime> fooPrimeList = new ArrayList<>();
for (Foo foo: fooList) {
fooPrimeList.add(new FooPrime().gamma(foo.getAlpha() + foo.getBeta()));
}
return fooPrimeList.toArray(new FooPrime[fooList.size()]);
}
private FooPrime[] testStreamToArray(ArrayList<Foo> fooList) {
return (FooPrime[]) fooList.stream().
map(it -> {
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).
toArray(FooPrime[]::new);
}
private FooPrime[] testStreamToArray2(ArrayList<Foo> fooList) {
return (FooPrime[]) fooList.stream().
map(it -> {
int stuff = it.getAlpha().length();
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).
toArray(FooPrime[]::new);
}
public static FooPrime fooToFooPrime(Foo foo) {
return new FooPrime().gamma(foo.getAlpha() + foo.getBeta());
}
public static class Foo {
private String alpha;
private String beta;
public String getAlpha() { return alpha; }
public String getBeta() { return beta; }
public void setAlpha(String alpha) { this.alpha = alpha; }
public void setBeta(String beta) { this.beta = beta; }
public Foo alpha(String alpha) { this.alpha = alpha; return this; }
public Foo beta(String beta) { this.beta = beta; return this; }
}
public static class FooPrime {
private String gamma;
public String getGamma() { return gamma; }
public void setGamma(String gamma) { this.gamma = gamma; }
public FooPrime gamma(String gamma) { this.gamma = gamma; return this; }
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((gamma == null) ? 0 : gamma.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
FooPrime other = (FooPrime) obj;
if (gamma == null) {
if (other.gamma != null)
return false;
} else if (!gamma.equals(other.gamma))
return false;
return true;
}
@Override
public String toString() {
return "FooPrime [gamma=" + gamma + "]";
}
}
}
And the base class:
package timings;
public class TimingContainer {
private int iterations;
private String label;
private TimingTest timingTest;
public TimingContainer(int iterations, String label, TimingTest timingTest) {
this.iterations = iterations;
this.label = label;
this.timingTest = timingTest;
}
public void run() {
long startTime = System.nanoTime();
for (int ctr = 0; ctr < iterations; ++ ctr) {
timingTest.randomize();
timingTest.run();
}
long endTime = System.nanoTime();
long totalns = (endTime - startTime);
System.out.println(label + ":" + (totalns / iterations) + " ns");
}
}
(Revised answer.)
Benchmarking in Java is difficult. Still, let us throw JMH at it... I ported your benchmark to JMH (see http://github.com/lemire/microbenchmarks).
These are the relevant methods...
public FooPrime[] basicstream(BenchmarkState s) {
return (FooPrime[]) s.fooList.stream().map(it -> {
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).toArray(FooPrime[]::new);
}
public FooPrime[] tweakedbasicstream(BenchmarkState s) {
return (FooPrime[]) s.fooList.stream().map(it -> {
int stuff = it.getAlpha().length();
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).toArray(FooPrime[]::new);
}
And here is the result of my run...
git clone https://github.com/lemire/microbenchmarks.git
cd microbenchmarks
mvn clean install
java -cp target/microbenchmarks-0.0.1-jar-with-dependencies.jar me.lemire.microbenchmarks.mysteries.MysteriousLambda
Benchmark Mode Samples Score Error Units
m.l.m.m.MysteriousLambda.basicstream avgt 5 17013.784 ± 46.536 ns/op
m.l.m.m.MysteriousLambda.tweakedbasicstream avgt 5 16240.451 ± 67.884 ns/op
Oddly, it seems that the two functions do not run at exactly the same average speed, there is a fairly significant difference. And that's while using JMH, a fairly good framework for benchmarking.
I thought at first that your two pieces of code were logically equivalent, but they are not. The apparently useless length method access forces the code to throw an exception when the String object returned is null.
So it is actually closer to the following piece of code...
@Benchmark
public FooPrime[] nullbasicstream(BenchmarkState s) {
return (FooPrime[]) s.fooList.stream().map(it -> {
if( it.getAlpha() == null) throw new NullPointerException();
return new FooPrime().gamma(it.getAlpha() + it.getBeta());
}).toArray(FooPrime[]::new);
}
And this is even faster than your tweaked function...
Benchmark Mode Samples Score Error Units
m.l.m.m.MysteriousLambda.basicstream avgt 5 17013.784 ± 46.536 ns/op
m.l.m.m.MysteriousLambda.nullbasicstream avgt 5 15983.762 ± 92.593 ns/op
m.l.m.m.MysteriousLambda.tweakedbasicstream avgt 5 16240.451 ± 67.884 ns/op
Why might this be?
Let us sidestep Java 8's stream programming and write the function the silly old way, with and without the null check:
@Benchmark
public FooPrime[] basicsum(BenchmarkState s) {
int howmany = s.fooList.size();
FooPrime[] answer = new FooPrime[s.fooList.size()];
for(int k = 0; k < howmany ; ++k ) {
Foo x = s.fooList.get(k);
answer[k] = new FooPrime(x.getAlpha() + x.getBeta());
}
return answer;
}
@Benchmark
public FooPrime[] basicsumnull(BenchmarkState s) {
int howmany = s.fooList.size();
FooPrime[] answer = new FooPrime[s.fooList.size()];
for(int k = 0; k < howmany ; ++k ) {
Foo x = s.fooList.get(k);
if(x.getAlpha() == null) throw new NullPointerException();
answer[k] = new FooPrime(x.getAlpha() + x.getBeta());
}
return answer;
}
And that is how we get the best performance...
m.l.m.m.MysteriousLambda.basicstream avgt 5 17019.730 ± 61.982 ns/op
m.l.m.m.MysteriousLambda.nullbasicstream avgt 5 16019.332 ± 62.831 ns/op
m.l.m.m.MysteriousLambda.basicsum avgt 5 15635.474 ± 119.890 ns/op
m.l.m.m.MysteriousLambda.basicsumnull avgt 5 14342.016 ± 109.958 ns/op
But the benefit of the null check remains.
Ok. Let us benchmark just the string sums, without anything else (no custom class). Let us have both the standard sum and the sum preceeded by a null check:
@Benchmark
public void stringsum(BenchmarkState s) {
for(int k = 0; k < s.N; ++k) s.list3[k] = s.list1[k] + s.list2[k];
}
@Benchmark
public void stringsum_withexcept(BenchmarkState s) {
for(int k = 0; k < s.N; ++k) {
if(s.list1[k] == null) throw new NullPointerException();
s.list3[k] = s.list1[k] + s.list2[k];
}
}
We get that the null check slows us down...
m.l.m.m.StringMerge.stringsum avgt 5 27011.111 ± 4.077 ns/op
m.l.m.m.StringMerge.stringsum_withexcept avgt 5 28387.825 ± 82.523 ns/op
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With