
Getting GHC to produce "Add With Carry (ADC)" instructions

Here is code that adds two triples of unboxed Words, each representing a 192-bit number, into a new triple of unboxed Words, and also returns any overflow:

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Prim (plusWord2#, plusWord#, Word#)

longAdd :: 
  (# Word#, Word#, Word# #) -> 
  (# Word#, Word#, Word# #) -> 
  (# Word#, (# Word#, Word#, Word# #) #)

longAdd (# xl, xm, xh #) (# yl, ym, yh #) =
  let
    -- Add two words plus an incoming carry; returns (# carry out, sum #).
    plusWord3 x y c =
      let
        (# c1, r1 #) = plusWord2# x y
        (# c2, r2 #) = plusWord2# r1 c
      in
        -- c1 and c2 cannot both be 1, so their sum is the carry out (0 or 1).
        (# plusWord# c1 c2, r2 #)
    (# cl, rl #) = plusWord2# xl yl
    (# cm, rm #) = plusWord3 xm ym cl
    (# ch, rh #) = plusWord3 xh yh cm
  in
    (# ch, (# rl, rm, rh #) #)

The issue is the "plusWord3" definition. Ideally, this is just like an "adc" function, which takes two words and the carry bit and returns the result and a new carry, so the resulting assembly is like the following:

add x1 y1
adc x2 y2
adc x3 y3

Unfortunately GHC, whether using the native code generator or LLVM, produces ugly assembly that saves the carry bit to a register and then reads it back with a separate extra add, instead of just using adc. I don't want to call an external C function for this: once you add the call overhead it's probably not worth it, and I'd like to stay in Haskell so the code can be inlined where possible. But I also want to be able to coax the compiler into producing the adc instruction appropriately. Is there any way I can achieve that?
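
For reference, the add-with-carry behaviour described above can be modelled on ordinary boxed Words. This is only a minimal sketch for checking results against, not a way to make GHC emit adc; the names adc and add3 are made up for illustration and the carry is represented as a Bool.

```haskell
-- Reference model of one add-with-carry step on boxed Words.
-- Returns (sum modulo the word size, carry out).
adc :: Word -> Word -> Bool -> (Word, Bool)
adc x y cin =
  let s1 = x + y                        -- wraps on overflow
      c1 = s1 < x                       -- wrapped iff the sum is below an operand
      s2 = s1 + (if cin then 1 else 0)  -- add the incoming carry
      c2 = s2 < s1                      -- wrapped iff adding the carry overflowed
  in (s2, c1 || c2)

-- Three-word addition, least significant word first.
-- Returns (final carry, (low, middle, high)).
add3 :: (Word, Word, Word) -> (Word, Word, Word) -> (Bool, (Word, Word, Word))
add3 (xl, xm, xh) (yl, ym, yh) =
  let (rl, cl) = adc xl yl False
      (rm, cm) = adc xm ym cl
      (rh, ch) = adc xh yh cm
  in (ch, (rl, rm, rh))
```

Adding 1 to a triple of all-ones words should ripple the carry through every word, giving three zero words and a final carry.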

Clinton asked Nov 08 '15 14:11


2 Answers

The most reliable and efficient way would be to call a primop directly in your program.

Using an FFI call is the easiest way, but as you noted, it won't be the most efficient because of the FFI overhead.

Even if the compiler supported the instruction you want and used it in some programs, relying on that would be fragile. A seemingly innocent change in your program may result in different generated assembly that doesn't use the instruction.

So my proposal is:

  1. Add the instruction you need to the x86 code generator backend, if it isn't there already.
  2. Add a primop that translates directly to the instruction you want to run. First make sure no such primop exists. Then follow these steps: https://ghc.haskell.org/trac/ghc/wiki/AddingNewPrimitiveOperations
  3. Your primop should then be visible in GHC.Prim (http://hackage.haskell.org/package/ghc-prim/docs/GHC-Prim.html); use it in your programs.
  4. Add tests and submit your patch :)
sinan answered Nov 10 '22 01:11


I'm not familiar with low-level programming, but after asking around on Freenode's #ghc channel, I got a pointer to the addIntC# primop, which is related to LLVM's llvm.sadd.with.overflow. I'm not sure what LLVM compiles that into.
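
To make the primop concrete, here is a minimal sketch of calling addIntC# through a boxed wrapper. The wrapper name addIntOverflow is hypothetical; addIntC#, isTrue# and /=# are taken from GHC.Exts, and the second component of the result is documented as nonzero when the signed sum does not fit in an Int#. Note that this reports signed overflow, which is not the same thing as the unsigned carry the question is after.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Int (I#), addIntC#, isTrue#, (/=#))

-- Hypothetical wrapper: truncated signed sum plus an overflow flag.
addIntOverflow :: Int -> Int -> (Int, Bool)
addIntOverflow (I# x) (I# y) =
  case addIntC# x y of
    -- o is zero when the true sum fits in an Int#, nonzero otherwise.
    (# r, o #) -> (I# r, isTrue# (o /=# 0#))
```

With this wrapper, adding 1 to maxBound should wrap around to minBound with the flag set, while small operands leave the flag clear.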


GHC's native code generator seems to know about the adc instruction: see X86/CodeGen.hs. But as the comment there says:

we handle addition, but rather badly


Edit: you are working with words. It seems that the LLVM backend compiles MO_Add2 (another name for plusWord2#) to llvm.uadd.with.overflow in https://github.com/ghc/ghc/blob/2b7d9c2b96eb9da3cce7826df4a91c3426095528/compiler/llvmGen/LlvmCodeGen/CodeGen.hs#L737 . Related ticket: https://ghc.haskell.org/trac/ghc/ticket/9430
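
For the unsigned case, a minimal sketch of plusWord2# behind a boxed wrapper looks like this. The wrapper name addWithCarryOut is hypothetical; plusWord2# itself returns the high part (the carry) first and the low part second. Whether the backend turns a chain of these into adc is exactly the open question, so this only shows the semantics.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Word (W#), plusWord2#)

-- Hypothetical wrapper: (carry word, low word) of the full-width sum.
addWithCarryOut :: Word -> Word -> (Word, Word)
addWithCarryOut (W# x) (W# y) =
  case plusWord2# x y of
    -- hi is 0 or 1; lo is the sum modulo the word size.
    (# hi, lo #) -> (W# hi, W# lo)
```

Adding 1 to maxBound should give a carry of 1 and a low word of 0.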

phadej answered Nov 10 '22 00:11