Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most efficient way to find smallest of 3 numbers Java?

For a lot of utility-type methods, the apache commons libraries have solid implementations that you can either leverage or get additional insight from. In this case, there is a method for finding the smallest of three doubles available in org.apache.commons.lang.math.NumberUtils. Their implementation is actually nearly identical to your initial thought:

public static double min(double a, double b, double c) {
    return Math.min(Math.min(a, b), c);
}

No, it's seriously not worth changing. The sort of improvements you're going to get when fiddling with micro-optimisations like this will not be worth it. Even the method call cost will be removed if the min function is called enough.

If you have a problem with your algorithm, your best bet is to look into macro-optimisations ("big picture" stuff like algorithm selection or tuning) - you'll generally get much better performance improvements there.

And your comment that removing Math.pow gave improvements may well be correct but that's because it's a relatively expensive operation. Math.min will not even be close to that in terms of cost.


double smallest = a;
if (smallest > b) smallest = b;
if (smallest > c) smallest = c;

Not necessarily faster than your code.


Let me first repeat what others already said, by quoting from the article "Structured Programming with go to Statements" by Donald Knuth:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.

(emphasis by me)

So if you have identified that a seemingly trivial operation like the computation of the minimum of three numbers is the actual bottleneck (that is, the "critical 3%") in your application, then you may consider optimizing it.

And in this case, this is actually possible: The Math#min(double,double) method in Java has very special semantics:

Returns the smaller of two double values. That is, the result is the value closer to negative infinity. If the arguments have the same value, the result is that same value. If either value is NaN, then the result is NaN. Unlike the numerical comparison operators, this method considers negative zero to be strictly smaller than positive zero. If one argument is positive zero and the other is negative zero, the result is negative zero.

One can have a look at the implementation, and see that it's actually rather complex:

public static double min(double a, double b) {
    if (a != a)
        return a;   // a is NaN
    if ((a == 0.0d) &&
        (b == 0.0d) &&
        (Double.doubleToRawLongBits(b) == negativeZeroDoubleBits)) {
        // Raw conversion ok since NaN can't map to -0.0.
        return b;
    }
    return (a <= b) ? a : b;
}

Now, it may be important to point out that this behavior is different from a simple comparison. This can easily be examined with the following example:

public class MinExample
{
    public static void main(String[] args)
    {
        test(0.0, 1.0);
        test(1.0, 0.0);
        test(-0.0, 0.0);
        test(Double.NaN, 1.0);
        test(1.0, Double.NaN);
    }

    private static void test(double a, double b)
    {
        double minA = Math.min(a, b);
        double minB = a < b ? a : b;
        System.out.println("a: "+a);
        System.out.println("b: "+b);
        System.out.println("minA "+minA);
        System.out.println("minB "+minB);
        if (Double.doubleToRawLongBits(minA) !=
            Double.doubleToRawLongBits(minB))
        {
            System.out.println(" -> Different results!");
        }
        System.out.println();
    }
}

However: If the treatment of NaN and positive/negative zero is not relevant for your application, you can replace the solution that is based on Math.min with a solution that is based on a simple comparison, and see whether it makes a difference.

This will, of course, be application dependent. Here is a simple, artificial microbenchmark (to be taken with a grain of salt!)

import java.util.Random;

public class MinPerformance
{
    public static void main(String[] args)
    {
        bench();
    }

    private static void bench()
    {
        int runs = 1000;
        for (int size=10000; size<=100000; size+=10000)
        {
            Random random = new Random(0);
            double data[] = new double[size];
            for (int i=0; i<size; i++)
            {
                data[i] = random.nextDouble();
            }
            benchA(data, runs);
            benchB(data, runs);
        }
    }

    private static void benchA(double data[], int runs)
    {
        long before = System.nanoTime();
        double sum = 0;
        for (int r=0; r<runs; r++)
        {
            for (int i=0; i<data.length-3; i++)
            {
                sum += minA(data[i], data[i+1], data[i+2]);
            }
        }
        long after = System.nanoTime();
        System.out.println("A: length "+data.length+", time "+(after-before)/1e6+", result "+sum);
    }

    private static void benchB(double data[], int runs)
    {
        long before = System.nanoTime();
        double sum = 0;
        for (int r=0; r<runs; r++)
        {
            for (int i=0; i<data.length-3; i++)
            {
                sum += minB(data[i], data[i+1], data[i+2]);
            }
        }
        long after = System.nanoTime();
        System.out.println("B: length "+data.length+", time "+(after-before)/1e6+", result "+sum);
    }

    private static double minA(double a, double b, double c)
    {
        return Math.min(a, Math.min(b, c));
    }

    private static double minB(double a, double b, double c)
    {
        if (a < b)
        {
            if (a < c)
            {
                return a;
            }
            return c;
        }
        if (b < c)
        {
            return b;
        }
        return c;
    }
}

(Disclaimer: Microbenchmarking in Java is an art, and for more reliable results, one should consider using JMH or Caliper).

Running this with JRE 1.8.0_31 may result in something like

....
A: length 90000, time 545.929078, result 2.247805342620906E7
B: length 90000, time 441.999193, result 2.247805342620906E7
A: length 100000, time 608.046928, result 2.5032781001456387E7
B: length 100000, time 493.747898, result 2.5032781001456387E7

This at least suggests that it might be possible to squeeze out a few percent here (again, in a very artifical example).


Analyzing this further, by looking at the hotspot disassembly output created with

java -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:+PrintAssembly MinPerformance

one can see the optimized versions of both methods, minA and minB.

First, the output for the method that uses Math.min:

Decoding compiled method 0x0000000002992310:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} {0x000000001c010910} &apos;minA&apos; &apos;(DDD)D&apos; in &apos;MinPerformance&apos;
  # parm0:    xmm0:xmm0   = double
  # parm1:    xmm1:xmm1   = double
  # parm2:    xmm2:xmm2   = double
  #           [sp+0x60]  (sp of caller)
  0x0000000002992480: mov    %eax,-0x6000(%rsp)
  0x0000000002992487: push   %rbp
  0x0000000002992488: sub    $0x50,%rsp
  0x000000000299248c: movabs $0x1c010cd0,%rsi
  0x0000000002992496: mov    0x8(%rsi),%edi
  0x0000000002992499: add    $0x8,%edi
  0x000000000299249c: mov    %edi,0x8(%rsi)
  0x000000000299249f: movabs $0x1c010908,%rsi   ; {metadata({method} {0x000000001c010910} &apos;minA&apos; &apos;(DDD)D&apos; in &apos;MinPerformance&apos;)}
  0x00000000029924a9: and    $0x3ff8,%edi
  0x00000000029924af: cmp    $0x0,%edi
  0x00000000029924b2: je     0x00000000029924e8  ;*dload_0
                        ; - MinPerformance::minA@0 (line 58)

  0x00000000029924b8: vmovsd %xmm0,0x38(%rsp)
  0x00000000029924be: vmovapd %xmm1,%xmm0
  0x00000000029924c2: vmovapd %xmm2,%xmm1       ;*invokestatic min
                        ; - MinPerformance::minA@4 (line 58)

  0x00000000029924c6: nop
  0x00000000029924c7: callq  0x00000000028c6360  ; OopMap{off=76}
                        ;*invokestatic min
                        ; - MinPerformance::minA@4 (line 58)
                        ;   {static_call}
  0x00000000029924cc: vmovapd %xmm0,%xmm1       ;*invokestatic min
                        ; - MinPerformance::minA@4 (line 58)

  0x00000000029924d0: vmovsd 0x38(%rsp),%xmm0   ;*invokestatic min
                        ; - MinPerformance::minA@7 (line 58)

  0x00000000029924d6: nop
  0x00000000029924d7: callq  0x00000000028c6360  ; OopMap{off=92}
                        ;*invokestatic min
                        ; - MinPerformance::minA@7 (line 58)
                        ;   {static_call}
  0x00000000029924dc: add    $0x50,%rsp
  0x00000000029924e0: pop    %rbp
  0x00000000029924e1: test   %eax,-0x27623e7(%rip)        # 0x0000000000230100
                        ;   {poll_return}
  0x00000000029924e7: retq
  0x00000000029924e8: mov    %rsi,0x8(%rsp)
  0x00000000029924ed: movq   $0xffffffffffffffff,(%rsp)
  0x00000000029924f5: callq  0x000000000297e260  ; OopMap{off=122}
                        ;*synchronization entry
                        ; - MinPerformance::minA@-1 (line 58)
                        ;   {runtime_call}
  0x00000000029924fa: jmp    0x00000000029924b8
  0x00000000029924fc: nop
  0x00000000029924fd: nop
  0x00000000029924fe: mov    0x298(%r15),%rax
  0x0000000002992505: movabs $0x0,%r10
  0x000000000299250f: mov    %r10,0x298(%r15)
  0x0000000002992516: movabs $0x0,%r10
  0x0000000002992520: mov    %r10,0x2a0(%r15)
  0x0000000002992527: add    $0x50,%rsp
  0x000000000299252b: pop    %rbp
  0x000000000299252c: jmpq   0x00000000028ec620  ; {runtime_call}
  0x0000000002992531: hlt
  0x0000000002992532: hlt
  0x0000000002992533: hlt
  0x0000000002992534: hlt
  0x0000000002992535: hlt
  0x0000000002992536: hlt
  0x0000000002992537: hlt
  0x0000000002992538: hlt
  0x0000000002992539: hlt
  0x000000000299253a: hlt
  0x000000000299253b: hlt
  0x000000000299253c: hlt
  0x000000000299253d: hlt
  0x000000000299253e: hlt
  0x000000000299253f: hlt
[Stub Code]
  0x0000000002992540: nop                       ;   {no_reloc}
  0x0000000002992541: nop
  0x0000000002992542: nop
  0x0000000002992543: nop
  0x0000000002992544: nop
  0x0000000002992545: movabs $0x0,%rbx          ; {static_stub}
  0x000000000299254f: jmpq   0x000000000299254f  ; {runtime_call}
  0x0000000002992554: nop
  0x0000000002992555: movabs $0x0,%rbx          ; {static_stub}
  0x000000000299255f: jmpq   0x000000000299255f  ; {runtime_call}
[Exception Handler]
  0x0000000002992564: callq  0x000000000297b9e0  ; {runtime_call}
  0x0000000002992569: mov    %rsp,-0x28(%rsp)
  0x000000000299256e: sub    $0x80,%rsp
  0x0000000002992575: mov    %rax,0x78(%rsp)
  0x000000000299257a: mov    %rcx,0x70(%rsp)
  0x000000000299257f: mov    %rdx,0x68(%rsp)
  0x0000000002992584: mov    %rbx,0x60(%rsp)
  0x0000000002992589: mov    %rbp,0x50(%rsp)
  0x000000000299258e: mov    %rsi,0x48(%rsp)
  0x0000000002992593: mov    %rdi,0x40(%rsp)
  0x0000000002992598: mov    %r8,0x38(%rsp)
  0x000000000299259d: mov    %r9,0x30(%rsp)
  0x00000000029925a2: mov    %r10,0x28(%rsp)
  0x00000000029925a7: mov    %r11,0x20(%rsp)
  0x00000000029925ac: mov    %r12,0x18(%rsp)
  0x00000000029925b1: mov    %r13,0x10(%rsp)
  0x00000000029925b6: mov    %r14,0x8(%rsp)
  0x00000000029925bb: mov    %r15,(%rsp)
  0x00000000029925bf: movabs $0x515db148,%rcx   ; {external_word}
  0x00000000029925c9: movabs $0x2992569,%rdx    ; {internal_word}
  0x00000000029925d3: mov    %rsp,%r8
  0x00000000029925d6: and    $0xfffffffffffffff0,%rsp
  0x00000000029925da: callq  0x00000000512a9020  ; {runtime_call}
  0x00000000029925df: hlt
[Deopt Handler Code]
  0x00000000029925e0: movabs $0x29925e0,%r10    ; {section_word}
  0x00000000029925ea: push   %r10
  0x00000000029925ec: jmpq   0x00000000028c7340  ; {runtime_call}
  0x00000000029925f1: hlt
  0x00000000029925f2: hlt
  0x00000000029925f3: hlt
  0x00000000029925f4: hlt
  0x00000000029925f5: hlt
  0x00000000029925f6: hlt
  0x00000000029925f7: hlt

One can see that the treatment of special cases involves some effort - compared to the output that uses simple comparisons, which is rather straightforward:

Decoding compiled method 0x0000000002998790:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} {0x000000001c0109c0} &apos;minB&apos; &apos;(DDD)D&apos; in &apos;MinPerformance&apos;
  # parm0:    xmm0:xmm0   = double
  # parm1:    xmm1:xmm1   = double
  # parm2:    xmm2:xmm2   = double
  #           [sp+0x20]  (sp of caller)
  0x00000000029988c0: sub    $0x18,%rsp
  0x00000000029988c7: mov    %rbp,0x10(%rsp) ;*synchronization entry
                        ; - MinPerformance::minB@-1 (line 63)

  0x00000000029988cc: vucomisd %xmm0,%xmm1
  0x00000000029988d0: ja     0x00000000029988ee  ;*ifge
                        ; - MinPerformance::minB@3 (line 63)

  0x00000000029988d2: vucomisd %xmm1,%xmm2
  0x00000000029988d6: ja     0x00000000029988de  ;*ifge
                        ; - MinPerformance::minB@22 (line 71)

  0x00000000029988d8: vmovapd %xmm2,%xmm0
  0x00000000029988dc: jmp    0x00000000029988e2
  0x00000000029988de: vmovapd %xmm1,%xmm0 ;*synchronization entry
                        ; - MinPerformance::minB@-1 (line 63)

  0x00000000029988e2: add    $0x10,%rsp
  0x00000000029988e6: pop    %rbp
  0x00000000029988e7: test   %eax,-0x27688ed(%rip)        # 0x0000000000230000
                        ;   {poll_return}
  0x00000000029988ed: retq
  0x00000000029988ee: vucomisd %xmm0,%xmm2
  0x00000000029988f2: ja     0x00000000029988e2  ;*ifge
                        ; - MinPerformance::minB@10 (line 65)

  0x00000000029988f4: vmovapd %xmm2,%xmm0
  0x00000000029988f8: jmp    0x00000000029988e2
  0x00000000029988fa: hlt
  0x00000000029988fb: hlt
  0x00000000029988fc: hlt
  0x00000000029988fd: hlt
  0x00000000029988fe: hlt
  0x00000000029988ff: hlt
[Exception Handler]
[Stub Code]
  0x0000000002998900: jmpq   0x00000000028ec920  ;   {no_reloc}
[Deopt Handler Code]
  0x0000000002998905: callq  0x000000000299890a
  0x000000000299890a: subq   $0x5,(%rsp)
  0x000000000299890f: jmpq   0x00000000028c7340  ; {runtime_call}
  0x0000000002998914: hlt
  0x0000000002998915: hlt
  0x0000000002998916: hlt
  0x0000000002998917: hlt

Whether or not there are cases where such an optimization really makes a difference in an application is hard to tell. But at least, the bottom line is:

  • The Math#min(double,double) method is not the same as a simple comparison, and the treatment of the special cases does not come for free
  • There are cases where the special case treatment that is done by Math#min is not necessary, and then a comparison-based approach may be more efficient
  • As already pointed out in other answers: In most cases, the performance difference will not matter. However, for this particular example, one should probably create a utility method min(double,double,double) anyhow, for better convenience and readability, and then it would be easy to do two runs with the different implementations, and see whether it really affects the performance.

(Side note: The integer type methods, like Math.min(int,int) actually are a simple comparison - so I would expect no difference for these).


You can use ternary operator as follows:

smallest=(a<b)?((a<c)?a:c):((b<c)?b:c);

Which takes only one assignment and minimum two comparisons.

But I think that these statements would not have any effect on execution time, your initial logic will take same time as of mine and all of others.


OP's efficient code has a bug:

when a == b, and a (or b) < c, the code will pick c instead of a or b.