I frequently have occasion to perform modular arithmetic in Haskell, where the modulus is usually large and often prime (like 2000000011). Currently, I just use functions like (modAdd m a b), (modMul m a b), (modDiv m a b), etc. But that is rather inconvenient: it requires an additional parameter to always be specified and carried around, and it means writing my various functions both in the regular integral form and separately in a mod- form.
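Those helpers are shaped roughly like this (a sketch of what I mean rather than my exact code; modPow is a hypothetical helper that division relies on, valid only for a prime modulus):

-- Explicit-modulus helpers: every operation takes the modulus m.
modAdd, modMul, modDiv :: Integer -> Integer -> Integer -> Integer
modAdd m a b = (a + b) `mod` m
modMul m a b = (a * b) `mod` m

-- Division by b is multiplication by b's modular inverse, computed via
-- Fermat's little theorem (b^(m-2) mod m), so m must be prime.
modDiv m a b = modMul m a (modPow m b (m - 2))

-- Modular exponentiation by repeated squaring.
modPow :: Integer -> Integer -> Integer -> Integer
modPow m _ 0 = 1 `mod` m
modPow m b e
  | even e    = sq
  | otherwise = modMul m b sq
  where half = modPow m b (e `div` 2)
        sq   = modMul m half half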
So it occurs to me that it might be a good idea to create a new class, something like this:
class Integral a => Mod a
m = 2000000011
instance Integral a => Num (Mod a) where
... defining (+), etc. as modulo m
Then one could just perform regular arithmetic, using regular functions, and define useful structures like
factorials :: [Mod Int]
factorials = 1:zipWith (*) factorials [1..]
But that has one problem: all values of type Mod Int would necessarily have the same modulus. However, I often need to work with multiple moduli in one program (though of course only ever combining values of the same modulus).
I think, but do not understand sufficiently well to be sure, that this could be overcome by something like this:
class Integral a => Mod Nat a
where Nat is a type that encodes the modulus in Peano fashion. This would be advantageous: I could have values of different moduli, and the type-checker would save me from accidentally combining values of different moduli.
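Concretely, I imagine something along these lines (just a rough sketch of the idea, using DataKinds to promote a hand-rolled Nat; the machinery for recovering the modulus at the value level is left out):

{-# LANGUAGE DataKinds, KindSignatures #-}

-- Peano naturals, promoted to the kind level by DataKinds.
data Nat = Z | S Nat

-- A value tagged with its modulus; the modulus lives only in the type,
-- so values of different moduli cannot be mixed by accident.
newtype Mod (n :: Nat) a = Mod a

-- Some mechanism (a class over Nat, or GHC's built-in type-level
-- literals) is still needed to recover the modulus at the value level
-- when implementing (+), (*) and friends.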
Is something like this feasible and efficient? Will it cause either the compiler or the RTS to try to actually construct the enormous Succ (Succ (Succ ...)) term, repeated 2000000011 times, if I try to use that modulus, rendering the solution effectively useless? Will the RTS try to check the type match on every operation? Will the RTS representation of every value be blown up from what could otherwise be just an unboxed Int?
Is there a better way?
Thanks to helpful comments from cirdec, dfeuer, user5402, and tikhon-jelvis, I learned that (unsurprisingly) I was not the first to have this idea. In particular, there is a paper by Kiselyov and Shan that gives an implementation, and tikhon-jelvis has posted to Hackage a solution called (surprise!) modular-arithmetic that offers even nicer semantics using fancy GHC pragmas.
What happens behind the curtain? In particular, would a million-element list of [Mod Int 2000000011] carry around an extra million copies of 2000000011? Or does it compile to the same code as a list of a million Ints with the modulus parameter carried separately? The latter would be nice.
I ran a bit of a benchmark on a current problem I am working on. The first run used an unboxed 10,000-element Int vector and performed 10,000 operations on it:
4,810,589,520 bytes allocated in the heap
107,496 bytes copied during GC
1,197,320 bytes maximum residency (1454 sample(s))
734,960 bytes maximum slop
10 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 6905 colls, 0 par 0.109s 0.101s 0.0000s 0.0006s
Gen 1 1454 colls, 0 par 0.812s 0.914s 0.0006s 0.0019s
TASKS: 13 (1 bound, 12 peak workers (12 total), using -N11)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.000s ( 0.001s elapsed)
MUT time 2.672s ( 2.597s elapsed)
GC time 0.922s ( 1.015s elapsed)
EXIT time 0.000s ( 0.001s elapsed)
Total time 3.594s ( 3.614s elapsed)
Alloc rate 1,800,454,557 bytes per MUT second
Productivity 74.3% of total user, 73.9% of total elapsed
For the second run, I performed the same operations on an unboxed Vector of 10,000 (Mod Int 1000000007). That made my code a little simpler, but it took about 3 times as long (while having a nearly identical memory profile):
4,810,911,824 bytes allocated in the heap
107,128 bytes copied during GC
1,199,408 bytes maximum residency (1453 sample(s))
736,928 bytes maximum slop
10 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 6906 colls, 0 par 0.094s 0.107s 0.0000s 0.0007s
Gen 1 1453 colls, 0 par 1.516s 1.750s 0.0012s 0.0035s
TASKS: 13 (1 bound, 12 peak workers (12 total), using -N11)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.000s ( 0.001s elapsed)
MUT time 8.562s ( 8.323s elapsed)
GC time 1.609s ( 1.857s elapsed)
EXIT time 0.000s ( 0.001s elapsed)
Total time 10.172s ( 10.183s elapsed)
Alloc rate 561,858,315 bytes per MUT second
Productivity 84.2% of total user, 84.1% of total elapsed
I wonder why that happens and whether it could be fixed. Still, I really like the modular-arithmetic package and will use it where performance is not absolutely critical.
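For reference, the benchmark's kernel was shaped roughly like the following (a simplified, hypothetical illustration of the explicit-modulus style only, not the actual code; the update function is made up):

import qualified Data.Vector.Unboxed as U

m :: Int
m = 1000000007

-- Explicit-modulus style: plain unboxed Ints, reduced by hand after each
-- operation. The second run did the same work on a vector of
-- (Mod Int 1000000007), letting its Num instance insert the reductions.
step :: U.Vector Int -> U.Vector Int
step = U.map (\x -> (x * 31 + 7) `mod` m)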
Newer versions of GHC have type-level numbers built in, which should be more efficient than ones you roll yourself using Peano arithmetic. You can use them by enabling DataKinds. As a bonus, you'd also get some nice syntax:
factorials :: [Mod Int 20]
Whether this is efficient or not depends on how you implement the Mod type. In the most general case, you probably want to just mod after every arithmetic operation. Unless you're in a hot loop where saving a handful of instructions matters, this should be fine. (And inside the hot loop, it's probably better to be explicit about when you mod anyhow.)
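For illustration, a hand-rolled type along those lines might look like this (a minimal sketch using GHC.TypeLits; this is not the modular-arithmetic package's actual implementation, and it fixes the underlying representation to Integer for simplicity):

{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}

import GHC.TypeLits (Nat, KnownNat, natVal)
import Data.Proxy (Proxy (..))

-- A value tagged with its modulus as a type-level natural.
newtype Mod (n :: Nat) = Mod Integer
  deriving Show

-- Recover the modulus from the type and reduce after every operation.
mkMod :: forall n. KnownNat n => Integer -> Mod n
mkMod x = Mod (x `mod` natVal (Proxy :: Proxy n))

instance KnownNat n => Num (Mod n) where
  Mod a + Mod b  = mkMod (a + b)
  Mod a - Mod b  = mkMod (a - b)
  Mod a * Mod b  = mkMod (a * b)
  fromInteger    = mkMod
  abs            = id
  signum (Mod a) = mkMod (signum a)

With this, (mkMod 10 :: Mod 7) * 5 evaluates to Mod 1, since 10 mod 7 is 3 and 15 mod 7 is 1.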
I actually implemented this type in a library on Hackage: modular-arithmetic. It has a test suite but no benchmarks, so I can't vouch for the absolute performance, but it's not doing anything that should be slow, and it was fast enough for my purposes. (Which, admittedly, involved small moduli.) If you try it and run into performance issues, I'd love to hear about them so I can try to fix them.
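Basic usage looks roughly like this (a quick sketch from memory; see the Data.Modular documentation for the exact API, where toMod and unMod wrap and unwrap values):

{-# LANGUAGE DataKinds #-}

import Data.Modular (Mod, toMod, unMod)

-- A value in Z/19; arithmetic wraps automatically via the Num instance.
x :: Mod Integer 19
x = toMod 25            -- 25 mod 19 = 6

y :: Mod Integer 19
y = x * 4 + 3           -- (6*4 + 3) mod 19 = 27 mod 19 = 8

main :: IO ()
main = print (unMod y)  -- prints 8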
Here is some working code using Data.Reflection:
{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE FlexibleContexts #-}
import Data.Reflection
import Data.Proxy
data M a s = M a -- Note the phantom comes *after* the concrete
-- In `normalize` we're tying the knot to get the phantom types to align
-- note that reflect :: Reifies s a => forall proxy. proxy s -> a
normalize :: (Reifies s a, Integral a) => a -> M a s
normalize a = b where b = M (mod a (reflect b))
instance (Reifies s a, Integral a) => Num (M a s) where
  M a + M b = normalize (a + b)
  M a - M b = normalize (a - b)
  M a * M b = normalize (a * b)
  fromInteger n = normalize (fromInteger n)
  abs _ = error "abs not implemented"
  signum _ = error "sgn not implemented"
withModulus :: Integral a => a -> (forall s. Reifies s a => M a s) -> a
withModulus m ma = reify m (runM . asProxyOf ma)
  where asProxyOf :: f s -> Proxy s -> f s
        asProxyOf a _ = a
runM :: M a s -> a
runM (M a) = a
example :: (Reifies s a, Integral a) => M a s
example = normalize 3
example2 :: (Reifies s a, Integral a, Num (M a s)) => M a s
example2 = 3*3 + 5*5
mfactorial :: (Reifies s a, Integral a, Num (M a s)) => Int -> M a s
mfactorial n = product $ map fromIntegral [1..n]
test1 :: Integral a => a -> Int -> a
test1 p n = withModulus p $ mfactorial n
madd :: (Reifies s Int, Num (M Int s)) => M Int s -> M Int s -> M Int s
madd a b = a + b
test2 :: Int -> Int -> Int -> Int
test2 p a b = withModulus p $ madd (fromIntegral a) (fromIntegral b)
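A quick sanity check in GHCi (illustrative examples, not part of the code above):

-- 10! = 3628800 is below the modulus, so it comes back unchanged:
-- >>> test1 (1000000007 :: Int) 10
-- 3628800
--
-- 5! = 120, and 120 mod 7 = 1:
-- >>> test1 (7 :: Int) 5
-- 1
--
-- (20 + 30) mod 17 = 50 mod 17 = 16:
-- >>> test2 17 20 30
-- 16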