For a lot of utility-type methods, the Apache Commons libraries have solid implementations that you can either leverage or gain additional insight from. In this case, there is a method for finding the smallest of three doubles available in org.apache.commons.lang.math.NumberUtils. Their implementation is actually nearly identical to your initial thought:
public static double min(double a, double b, double c) {
    return Math.min(Math.min(a, b), c);
}
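If you already have Commons Lang on the classpath, a call might look like the following sketch (note: in Commons Lang 3 the class moved to org.apache.commons.lang3.math.NumberUtils):

import org.apache.commons.lang3.math.NumberUtils;

public class NumberUtilsDemo
{
    public static void main(String[] args)
    {
        // NumberUtils.min has three-argument overloads for int, long, float and double
        double smallest = NumberUtils.min(3.2, 1.5, 2.8);
        System.out.println(smallest); // 1.5
    }
}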
No, it's seriously not worth changing. The sort of improvement you're going to get from fiddling with micro-optimisations like this will not be worth it. Even the method call cost will be removed if min is called often enough (the JIT will inline it).
If you have a problem with your algorithm, your best bet is to look into macro-optimisations ("big picture" stuff like algorithm selection or tuning) - you'll generally get much better performance improvements there.
And your comment that removing Math.pow gave improvements may well be correct, but that's because Math.pow is a relatively expensive operation. Math.min will not even be close to that in terms of cost.
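For instance (a hypothetical snippet of mine, not taken from the question), squaring via Math.pow is usually replaced by a plain multiplication:

double dx = 3.0, dy = 4.0;                         // sample values
double viaPow = Math.pow(dx, 2) + Math.pow(dy, 2); // general-purpose, relatively costly
double viaMul = dx*dx + dy*dy;                     // same result, much cheaper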
double smallest = a;
if (smallest > b) smallest = b;
if (smallest > c) smallest = c;
Not necessarily faster than your code.
Let me first repeat what others already said, by quoting from the article "Structured Programming with go to Statements" by Donald Knuth:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.
(emphasis by me)
So if you have identified that a seemingly trivial operation like the computation of the minimum of three numbers is the actual bottleneck (that is, the "critical 3%") in your application, then you may consider optimizing it.
And in this case, this is actually possible: the Math#min(double,double) method in Java has very special semantics:
Returns the smaller of two double values. That is, the result is the value closer to negative infinity. If the arguments have the same value, the result is that same value. If either value is NaN, then the result is NaN. Unlike the numerical comparison operators, this method considers negative zero to be strictly smaller than positive zero. If one argument is positive zero and the other is negative zero, the result is negative zero.
One can have a look at the implementation, and see that it's actually rather complex:
public static double min(double a, double b) {
    if (a != a)
        return a;   // a is NaN
    if ((a == 0.0d) &&
        (b == 0.0d) &&
        (Double.doubleToRawLongBits(b) == negativeZeroDoubleBits)) {
        // Raw conversion ok since NaN can't map to -0.0.
        // (negativeZeroDoubleBits is a private constant in Math,
        // equal to Double.doubleToRawLongBits(-0.0d).)
        return b;
    }
    return (a <= b) ? a : b;
}
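The negative-zero branch relies on the fact that -0.0 and 0.0 compare equal with ==, but have different raw bit patterns - a quick check (my own snippet, not JDK code):

System.out.println(-0.0 == 0.0);                                        // true
System.out.println(Long.toHexString(Double.doubleToRawLongBits( 0.0))); // 0
System.out.println(Long.toHexString(Double.doubleToRawLongBits(-0.0))); // 8000000000000000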
Now, it may be important to point out that this behavior is different from a simple comparison. This can easily be examined with the following example:
public class MinExample
{
    public static void main(String[] args)
    {
        test(0.0, 1.0);
        test(1.0, 0.0);
        test(-0.0, 0.0);
        test(Double.NaN, 1.0);
        test(1.0, Double.NaN);
    }

    private static void test(double a, double b)
    {
        double minA = Math.min(a, b);
        double minB = a < b ? a : b;
        System.out.println("a: "+a);
        System.out.println("b: "+b);
        System.out.println("minA "+minA);
        System.out.println("minB "+minB);
        if (Double.doubleToRawLongBits(minA) !=
            Double.doubleToRawLongBits(minB))
        {
            System.out.println(" -> Different results!");
        }
        System.out.println();
    }
}
However: If the treatment of NaN and positive/negative zero is not relevant for your application, you can replace the solution that is based on Math.min with a solution that is based on a simple comparison, and see whether it makes a difference.
This will, of course, be application dependent. Here is a simple, artificial microbenchmark (to be taken with a grain of salt!):
import java.util.Random;

public class MinPerformance
{
    public static void main(String[] args)
    {
        bench();
    }

    private static void bench()
    {
        int runs = 1000;
        for (int size=10000; size<=100000; size+=10000)
        {
            Random random = new Random(0);
            double data[] = new double[size];
            for (int i=0; i<size; i++)
            {
                data[i] = random.nextDouble();
            }
            benchA(data, runs);
            benchB(data, runs);
        }
    }

    private static void benchA(double data[], int runs)
    {
        long before = System.nanoTime();
        double sum = 0;
        for (int r=0; r<runs; r++)
        {
            for (int i=0; i<data.length-3; i++)
            {
                sum += minA(data[i], data[i+1], data[i+2]);
            }
        }
        long after = System.nanoTime();
        System.out.println("A: length "+data.length+", time "+(after-before)/1e6+", result "+sum);
    }

    private static void benchB(double data[], int runs)
    {
        long before = System.nanoTime();
        double sum = 0;
        for (int r=0; r<runs; r++)
        {
            for (int i=0; i<data.length-3; i++)
            {
                sum += minB(data[i], data[i+1], data[i+2]);
            }
        }
        long after = System.nanoTime();
        System.out.println("B: length "+data.length+", time "+(after-before)/1e6+", result "+sum);
    }

    private static double minA(double a, double b, double c)
    {
        return Math.min(a, Math.min(b, c));
    }

    private static double minB(double a, double b, double c)
    {
        if (a < b)
        {
            if (a < c)
            {
                return a;
            }
            return c;
        }
        if (b < c)
        {
            return b;
        }
        return c;
    }
}
(Disclaimer: Microbenchmarking in Java is an art, and for more reliable results, one should consider using JMH or Caliper).
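For example, a JMH version of the same measurement could look roughly like the following sketch (assuming the JMH annotations and annotation processor are on the classpath; the class and method names here are mine):

import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class MinBenchmark
{
    private double data[];

    @Setup
    public void setup()
    {
        Random random = new Random(0);
        data = new double[100000];
        for (int i=0; i<data.length; i++)
        {
            data[i] = random.nextDouble();
        }
    }

    @Benchmark
    public double mathMin()
    {
        double sum = 0;
        for (int i=0; i<data.length-3; i++)
        {
            sum += Math.min(data[i], Math.min(data[i+1], data[i+2]));
        }
        return sum; // returning the result prevents dead-code elimination
    }

    @Benchmark
    public double comparisonMin()
    {
        double sum = 0;
        for (int i=0; i<data.length-3; i++)
        {
            double a = data[i], b = data[i+1], c = data[i+2];
            sum += a < b ? (a < c ? a : c) : (b < c ? b : c);
        }
        return sum;
    }
}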
Running this with JRE 1.8.0_31 may result in something like
....
A: length 90000, time 545.929078, result 2.247805342620906E7
B: length 90000, time 441.999193, result 2.247805342620906E7
A: length 100000, time 608.046928, result 2.5032781001456387E7
B: length 100000, time 493.747898, result 2.5032781001456387E7
This at least suggests that it might be possible to squeeze out a few percent here (again, in a very artificial example).
Analyzing this further, by looking at the hotspot disassembly output (note that -XX:+PrintAssembly requires the hsdis disassembler plugin to be installed) created with
java -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:+PrintAssembly MinPerformance
one can see the optimized versions of both methods, minA and minB.

First, the output for the method that uses Math.min:
Decoding compiled method 0x0000000002992310:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
# {method} {0x000000001c010910} 'minA' '(DDD)D' in 'MinPerformance'
# parm0: xmm0:xmm0 = double
# parm1: xmm1:xmm1 = double
# parm2: xmm2:xmm2 = double
# [sp+0x60] (sp of caller)
0x0000000002992480: mov %eax,-0x6000(%rsp)
0x0000000002992487: push %rbp
0x0000000002992488: sub $0x50,%rsp
0x000000000299248c: movabs $0x1c010cd0,%rsi
0x0000000002992496: mov 0x8(%rsi),%edi
0x0000000002992499: add $0x8,%edi
0x000000000299249c: mov %edi,0x8(%rsi)
0x000000000299249f: movabs $0x1c010908,%rsi ; {metadata({method} {0x000000001c010910} 'minA' '(DDD)D' in 'MinPerformance')}
0x00000000029924a9: and $0x3ff8,%edi
0x00000000029924af: cmp $0x0,%edi
0x00000000029924b2: je 0x00000000029924e8 ;*dload_0
; - MinPerformance::minA@0 (line 58)
0x00000000029924b8: vmovsd %xmm0,0x38(%rsp)
0x00000000029924be: vmovapd %xmm1,%xmm0
0x00000000029924c2: vmovapd %xmm2,%xmm1 ;*invokestatic min
; - MinPerformance::minA@4 (line 58)
0x00000000029924c6: nop
0x00000000029924c7: callq 0x00000000028c6360 ; OopMap{off=76}
;*invokestatic min
; - MinPerformance::minA@4 (line 58)
; {static_call}
0x00000000029924cc: vmovapd %xmm0,%xmm1 ;*invokestatic min
; - MinPerformance::minA@4 (line 58)
0x00000000029924d0: vmovsd 0x38(%rsp),%xmm0 ;*invokestatic min
; - MinPerformance::minA@7 (line 58)
0x00000000029924d6: nop
0x00000000029924d7: callq 0x00000000028c6360 ; OopMap{off=92}
;*invokestatic min
; - MinPerformance::minA@7 (line 58)
; {static_call}
0x00000000029924dc: add $0x50,%rsp
0x00000000029924e0: pop %rbp
0x00000000029924e1: test %eax,-0x27623e7(%rip) # 0x0000000000230100
; {poll_return}
0x00000000029924e7: retq
0x00000000029924e8: mov %rsi,0x8(%rsp)
0x00000000029924ed: movq $0xffffffffffffffff,(%rsp)
0x00000000029924f5: callq 0x000000000297e260 ; OopMap{off=122}
;*synchronization entry
; - MinPerformance::minA@-1 (line 58)
; {runtime_call}
0x00000000029924fa: jmp 0x00000000029924b8
0x00000000029924fc: nop
0x00000000029924fd: nop
0x00000000029924fe: mov 0x298(%r15),%rax
0x0000000002992505: movabs $0x0,%r10
0x000000000299250f: mov %r10,0x298(%r15)
0x0000000002992516: movabs $0x0,%r10
0x0000000002992520: mov %r10,0x2a0(%r15)
0x0000000002992527: add $0x50,%rsp
0x000000000299252b: pop %rbp
0x000000000299252c: jmpq 0x00000000028ec620 ; {runtime_call}
0x0000000002992531: hlt
0x0000000002992532: hlt
0x0000000002992533: hlt
0x0000000002992534: hlt
0x0000000002992535: hlt
0x0000000002992536: hlt
0x0000000002992537: hlt
0x0000000002992538: hlt
0x0000000002992539: hlt
0x000000000299253a: hlt
0x000000000299253b: hlt
0x000000000299253c: hlt
0x000000000299253d: hlt
0x000000000299253e: hlt
0x000000000299253f: hlt
[Stub Code]
0x0000000002992540: nop ; {no_reloc}
0x0000000002992541: nop
0x0000000002992542: nop
0x0000000002992543: nop
0x0000000002992544: nop
0x0000000002992545: movabs $0x0,%rbx ; {static_stub}
0x000000000299254f: jmpq 0x000000000299254f ; {runtime_call}
0x0000000002992554: nop
0x0000000002992555: movabs $0x0,%rbx ; {static_stub}
0x000000000299255f: jmpq 0x000000000299255f ; {runtime_call}
[Exception Handler]
0x0000000002992564: callq 0x000000000297b9e0 ; {runtime_call}
0x0000000002992569: mov %rsp,-0x28(%rsp)
0x000000000299256e: sub $0x80,%rsp
0x0000000002992575: mov %rax,0x78(%rsp)
0x000000000299257a: mov %rcx,0x70(%rsp)
0x000000000299257f: mov %rdx,0x68(%rsp)
0x0000000002992584: mov %rbx,0x60(%rsp)
0x0000000002992589: mov %rbp,0x50(%rsp)
0x000000000299258e: mov %rsi,0x48(%rsp)
0x0000000002992593: mov %rdi,0x40(%rsp)
0x0000000002992598: mov %r8,0x38(%rsp)
0x000000000299259d: mov %r9,0x30(%rsp)
0x00000000029925a2: mov %r10,0x28(%rsp)
0x00000000029925a7: mov %r11,0x20(%rsp)
0x00000000029925ac: mov %r12,0x18(%rsp)
0x00000000029925b1: mov %r13,0x10(%rsp)
0x00000000029925b6: mov %r14,0x8(%rsp)
0x00000000029925bb: mov %r15,(%rsp)
0x00000000029925bf: movabs $0x515db148,%rcx ; {external_word}
0x00000000029925c9: movabs $0x2992569,%rdx ; {internal_word}
0x00000000029925d3: mov %rsp,%r8
0x00000000029925d6: and $0xfffffffffffffff0,%rsp
0x00000000029925da: callq 0x00000000512a9020 ; {runtime_call}
0x00000000029925df: hlt
[Deopt Handler Code]
0x00000000029925e0: movabs $0x29925e0,%r10 ; {section_word}
0x00000000029925ea: push %r10
0x00000000029925ec: jmpq 0x00000000028c7340 ; {runtime_call}
0x00000000029925f1: hlt
0x00000000029925f2: hlt
0x00000000029925f3: hlt
0x00000000029925f4: hlt
0x00000000029925f5: hlt
0x00000000029925f6: hlt
0x00000000029925f7: hlt
One can see that the treatment of special cases involves some effort - compared to the output that uses simple comparisons, which is rather straightforward:
Decoding compiled method 0x0000000002998790:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
# {method} {0x000000001c0109c0} 'minB' '(DDD)D' in 'MinPerformance'
# parm0: xmm0:xmm0 = double
# parm1: xmm1:xmm1 = double
# parm2: xmm2:xmm2 = double
# [sp+0x20] (sp of caller)
0x00000000029988c0: sub $0x18,%rsp
0x00000000029988c7: mov %rbp,0x10(%rsp) ;*synchronization entry
; - MinPerformance::minB@-1 (line 63)
0x00000000029988cc: vucomisd %xmm0,%xmm1
0x00000000029988d0: ja 0x00000000029988ee ;*ifge
; - MinPerformance::minB@3 (line 63)
0x00000000029988d2: vucomisd %xmm1,%xmm2
0x00000000029988d6: ja 0x00000000029988de ;*ifge
; - MinPerformance::minB@22 (line 71)
0x00000000029988d8: vmovapd %xmm2,%xmm0
0x00000000029988dc: jmp 0x00000000029988e2
0x00000000029988de: vmovapd %xmm1,%xmm0 ;*synchronization entry
; - MinPerformance::minB@-1 (line 63)
0x00000000029988e2: add $0x10,%rsp
0x00000000029988e6: pop %rbp
0x00000000029988e7: test %eax,-0x27688ed(%rip) # 0x0000000000230000
; {poll_return}
0x00000000029988ed: retq
0x00000000029988ee: vucomisd %xmm0,%xmm2
0x00000000029988f2: ja 0x00000000029988e2 ;*ifge
; - MinPerformance::minB@10 (line 65)
0x00000000029988f4: vmovapd %xmm2,%xmm0
0x00000000029988f8: jmp 0x00000000029988e2
0x00000000029988fa: hlt
0x00000000029988fb: hlt
0x00000000029988fc: hlt
0x00000000029988fd: hlt
0x00000000029988fe: hlt
0x00000000029988ff: hlt
[Exception Handler]
[Stub Code]
0x0000000002998900: jmpq 0x00000000028ec920 ; {no_reloc}
[Deopt Handler Code]
0x0000000002998905: callq 0x000000000299890a
0x000000000299890a: subq $0x5,(%rsp)
0x000000000299890f: jmpq 0x00000000028c7340 ; {runtime_call}
0x0000000002998914: hlt
0x0000000002998915: hlt
0x0000000002998916: hlt
0x0000000002998917: hlt
Whether or not there are cases where such an optimization really makes a difference in an application is hard to tell. But at least, the bottom line is:

- The Math#min(double,double) method is not the same as a simple comparison, and the treatment of the special cases does not come for free.
- There may be cases where the strict treatment of NaN and negative zero by Math#min is not necessary, and then a comparison-based approach may be more efficient.
- One should probably introduce a utility method min(double,double,double) anyhow, for better convenience and readability (a sketch follows below); then it would be easy to do two runs with the different implementations and see whether it really affects the performance.

(Side note: The integer type methods, like Math.min(int,int), actually are a simple comparison - so I would expect no difference for these.)
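Such a utility method might look like the following sketch (the names are mine); having both variants side by side makes the two test runs trivial:

public final class MathUtils
{
    private MathUtils()
    {
    }

    // Variant A: inherits the full Math.min semantics (NaN, -0.0).
    public static double min(double a, double b, double c)
    {
        return Math.min(a, Math.min(b, c));
    }

    // Variant B: plain comparisons; differs from Math.min for NaN and -0.0.
    public static double minFast(double a, double b, double c)
    {
        return a < b ? (a < c ? a : c) : (b < c ? b : c);
    }
}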
You can use the ternary operator as follows:

smallest = (a < b) ? ((a < c) ? a : c) : ((b < c) ? b : c);

This takes only one assignment and exactly two comparisons. But I think these statements will not have any noticeable effect on execution time; your initial logic will take the same time as mine and all the others.
OP's efficient code has a bug: when a == b, and a (or b) < c, the code will pick c instead of a (or b).
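To see why ties matter, here is a minimal reproduction of that failure mode (assuming the questioned code used strict < in both branches - my reading of the question, not a verbatim quote):

double a = 1.0, b = 1.0, c = 2.0;
double smallest;
if (a < b && a < c)
{
    smallest = a;
}
else if (b < c && b < a) // b < a is false when a == b ...
{
    smallest = b;
}
else
{
    smallest = c; // ... so the tie falls through to c
}
System.out.println(smallest); // prints 2.0 instead of 1.0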