I'm summing a long list of Ratios in Clojure, something like:
(defn sum-ratios
[n]
(reduce
(fn [total ind]
(+
total
(/
(inc (rand-int 100))
(inc (rand-int 100)))))
(range 0 n)))
The runtime for various n is:
The (less precise) alternative is to sum these values as doubles:
(defn sum-doubles
[n]
(reduce
(fn [total ind]
(+
total
(double
(/
(inc (rand-int 100))
(inc (rand-int 100))))))
(range 0 n)))
The runtime for this version is:
Why is it significantly slower to sum Ratios? I'm guessing that it has to do with finding the least common multiple of the denominators of the Ratios being summed, but does anybody know specifically which algorithm Clojure uses to sum Ratios?
Let's follow what happens when you +
two Ratio
s, which is what happens at each step in the reduction. We start at the two-arity version of +:
([x y] (. clojure.lang.Numbers (add x y)))
This takes us to Numbers.add(Obj, Obj)
:
return ops(x).combine(ops(y)).add((Number)x, (Number)y);
ops
looks at the class of the first operand and will find RatioOps
. This leads to the RatioOps.add
function:
final public Number add(Number x, Number y){
Ratio rx = toRatio(x);
Ratio ry = toRatio(y);
Number ret = divide(ry.numerator.multiply(rx.denominator)
.add(rx.numerator.multiply(ry.denominator))
, ry.denominator.multiply(rx.denominator));
return normalizeRet(ret, x, y);
}
So here's your algorithm. There are five BigInteger
operations here (three multiplies, one add, one divide):
(yn*xd + xn*yd) / (xd*yd)
You can see how multiply is implemented; it alone is not trivial, and you can examine the others for yourself.
Sure enough, the divide function involves finding the gcd
between the two numbers so it can be reduced:
static public Number divide(BigInteger n, BigInteger d){
if(d.equals(BigInteger.ZERO))
throw new ArithmeticException("Divide by zero");
BigInteger gcd = n.gcd(d);
if(gcd.equals(BigInteger.ZERO))
return BigInt.ZERO;
n = n.divide(gcd);
d = d.divide(gcd);
...
}
The gcd
function creates two new MutableBigInteger
objects.
Computationally, it's expensive, as you can see from all of the above. However, don't discount the cost of extra incidental object creation (as in gcd
above), as that is often more expensive as we involve non-cached memory access.
The double
conversion is not free, FWIW, as it involves a division of two newly-created BigDecimal
s.
You really need a profiler to see exactly where the cost is. But hopefully the above gives a little context.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With