Scala specialization for numeric operations on primitive types

I wrote a function doing simple math:

def clamp(num: Double, min: Double, max: Double) =
  if (num < min) min else if (num > max) max else num

That was fine until I needed the same function for Long. I generalized it with a type parameter and specialization:

import Ordering.Implicits._
def clamp[@specialized N: Ordering](num: N, min: N, max: N) =
  if (num < min) min else if (num > max) max else num

It works, but I found that the bytecode does lots of boxing and unboxing under the hood:

public boolean clamp$mZc$sp(boolean num, boolean min, boolean max, Ordering<Object> evidence$1)
  return Ordering.Implicits..MODULE$.infixOrderingOps(BoxesRunTime.boxToBoolean(num), evidence$1).$greater(BoxesRunTime.boxToBoolean(max)) ? max : Ordering.Implicits..MODULE$.infixOrderingOps(BoxesRunTime.boxToBoolean(num), evidence$1).$less(BoxesRunTime.boxToBoolean(min)) ? min : num;

public byte clamp$mBc$sp(byte num, byte min, byte max, Ordering<Object> evidence$1)
  return Ordering.Implicits..MODULE$.infixOrderingOps(BoxesRunTime.boxToByte(num), evidence$1).$greater(BoxesRunTime.boxToByte(max)) ? max : Ordering.Implicits..MODULE$.infixOrderingOps(BoxesRunTime.boxToByte(num), evidence$1).$less(BoxesRunTime.boxToByte(min)) ? min : num;

public char clamp$mCc$sp(char num, char min, char max, Ordering<Object> evidence$1)
  return Ordering.Implicits..MODULE$.infixOrderingOps(BoxesRunTime.boxToCharacter(num), evidence$1).$greater(BoxesRunTime.boxToCharacter(max)) ? max : Ordering.Implicits..MODULE$.infixOrderingOps(BoxesRunTime.boxToCharacter(num), evidence$1).$less(BoxesRunTime.boxToCharacter(min)) ? min : num;

Is there a better way to do generic arithmetic operations without boxing?

1 Answer

The Spire project is definitely the right place to look for high-performance numerical abstractions. All its type classes are specialized for common primitive types such as Long, Double, Float, and Int.

Here is your method using Spire type classes:

import spire.algebra._
import spire.implicits._
def clamp[@specialized T:Order](a: T, min: T, max: T) =
  if(a < min) min else if(a > max) max else a

And here is the specialized bytecode (Long version), extracted using :javap from the Scala REPL:

public long clamp$mJc$sp(long, long, long, spire.algebra.Order<java.lang.Object>);
    descriptor: (JJJLspire/algebra/Order;)J
    flags: ACC_PUBLIC
      stack=5, locals=8, args_size=5
         0: aload         7
         2: lload_1
         3: lload_3
         4: invokeinterface #96,  5           // InterfaceMethod spire/algebra/Order.lt$mcJ$sp:(JJ)Z
         9: ifeq          16
        12: lload_3
        13: goto          35
        16: aload         7
        18: lload_1
        19: lload         5
        21: invokeinterface #99,  5           // InterfaceMethod spire/algebra/Order.gt$mcJ$sp:(JJ)Z
        26: ifeq          34
        29: lload         5
        31: goto          35
        34: lload_1
        35: lreturn

As you can see, it calls the Long-specialized versions of the lt and gt methods of spire.algebra.Order (lt$mcJ$sp and gt$mcJ$sp), which take and compare primitive longs directly, so there is no boxing involved.
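To illustrate why specializing the type class itself matters, here is a minimal sketch of the same idea without Spire. The `Ord` trait, its two instances, and the `Clamp` object are hypothetical names made up for this example; they are not part of Spire or the standard library. Under Scala 2, marking both the trait and the method as `@specialized` should let the compiler emit primitive variants of `lt` and `gt`, which is what the standard `Ordering` lacks.

```scala
// Hypothetical type class (not from Spire): specialized so that Scala 2
// generates primitive variants of lt/gt for the listed types.
trait Ord[@specialized(Int, Long, Double) A] {
  def lt(x: A, y: A): Boolean
  def gt(x: A, y: A): Boolean
}

object Ord {
  // Instances for the two types mentioned in the question.
  implicit val longOrd: Ord[Long] = new Ord[Long] {
    def lt(x: Long, y: Long): Boolean = x < y
    def gt(x: Long, y: Long): Boolean = x > y
  }
  implicit val doubleOrd: Ord[Double] = new Ord[Double] {
    def lt(x: Double, y: Double): Boolean = x < y
    def gt(x: Double, y: Double): Boolean = x > y
  }
}

object Clamp {
  // Calls the type class methods directly, so no operator wrapper is needed.
  def clamp[@specialized(Int, Long, Double) A](a: A, min: A, max: A)(implicit ord: Ord[A]): A =
    if (ord.lt(a, min)) min else if (ord.gt(a, max)) max else a
}
```

For example, `Clamp.clamp(15L, 0L, 10L)` resolves `Ord.longOrd` implicitly and returns `10L`. Whether boxing is really gone is best confirmed with `:javap`, as done above for the Spire version.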

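Notice also that the translation from the operators (< and >) to the type class method invocations leaves no trace in the bytecode: the Order instance is loaded and its lt and gt methods are invoked directly, with no intermediate wrapper object for the operator syntax. The machinery behind this is quite elaborate. See this blog post from Erik Osheim, one of the main authors of Spire.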
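As a rough illustration of what that translation amounts to, the operators can be replaced by hand with explicit calls on the `Order` instance. This is a sketch, assuming Spire is on the classpath; the `ExplicitClamp` object and the parameter name `ord` are made up for the example, and the bytecode Spire's own machinery produces may differ in detail.

```scala
import spire.algebra.Order
import spire.implicits._ // brings Order instances for Long, Double, etc. into scope

object ExplicitClamp {
  // Same logic as the operator-based version, with `a < min` written as
  // ord.lt(a, min) and `a > max` written as ord.gt(a, max).
  def clamp[@specialized(Long, Double) T](a: T, min: T, max: T)(implicit ord: Order[T]): T =
    if (ord.lt(a, min)) min else if (ord.gt(a, max)) max else a
}
```

Here `ExplicitClamp.clamp(15L, 0L, 10L)` returns `10L`. The operator form is nicer to read; the explicit form just makes the underlying calls visible.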

But the bottom line is that the result is very fast even though the code is generic.

