Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which one of theese XOR are considered better in Haskell

I would like to know what is the most standard way in Haskell.

The first one states clearly that we want two arguments (most of the time).

The second involves a function call (id) in the second clause, so it should be less efficient because in the first implementation we can simply return the second argument.

So i tend to think that the first is better and should be the one to pick : easier to read and to figure out what it does[1], and a function call save.

But i'm newbie to Haskell, maybe the compiler optimize this extra call.

xor :: Bool -> Bool -> Bool

xor True x = not x
xor False x = x

xor True = not
xor False = id

Also, i would like to know if i can replace both False with a wildcard there.

So, what is the good practice in Haskell. Maybe another implementation ?

[1] We omit there that it is a well known functionallity, let's imagine it is a non-trivial function.

Thanks

like image 992
lud Avatar asked Jul 08 '13 19:07

lud


3 Answers

For readability, I would try to avoid pattern matching and define the function with a single equation that expresses something interesting about the function to be defined. That's not always possible, but for this example, there are many options:

  • xor = (/=)
  • xor a b = a /= b
  • xor a b = not (a == b)
  • xor a b = (a && not b) || (not a && b)
  • xor a b = (a || b) && not (a && b)
  • xor a b = odd (fromEnum a + fromEnum b)
like image 200
Toxaris Avatar answered Dec 06 '22 13:12

Toxaris


Of course it depends on the compiler and the options passed to the compiler.

For this particular example, if you compile without optimisations, GHC produces the code as you have written it, so the second version contains a call to id resp. to not. That is slightly less efficient than the first version, which then only contains the call to not:

Xors.xor1 :: GHC.Types.Bool -> GHC.Types.Bool -> GHC.Types.Bool
[GblId, Arity=2]
Xors.xor1 =
  \ (ds_dkm :: GHC.Types.Bool) (x_aeI :: GHC.Types.Bool) ->
    case ds_dkm of _ {
      GHC.Types.False -> x_aeI;
      GHC.Types.True -> GHC.Classes.not x_aeI
    }

Xors.xor2 :: GHC.Types.Bool -> GHC.Types.Bool -> GHC.Types.Bool
[GblId, Arity=1]
Xors.xor2 =
  \ (ds_dki :: GHC.Types.Bool) ->
    case ds_dki of _ {
      GHC.Types.False -> GHC.Base.id @ GHC.Types.Bool;
      GHC.Types.True -> GHC.Classes.not
    }

(the calls are still in the produced assembly, but core is more readable, so I post only that).

But with optimisations, both functions compile to the same core (and thence to the same machine code),

Xors.xor2 =
  \ (ds_dkf :: GHC.Types.Bool) (eta_B1 :: GHC.Types.Bool) ->
    case ds_dkf of _ {
      GHC.Types.False -> eta_B1;
      GHC.Types.True ->
        case eta_B1 of _ {
          GHC.Types.False -> GHC.Types.True;
          GHC.Types.True -> GHC.Types.False
        }
    }

GHC eta-expanded the second version and inlined the calls to id and not, you get pure pattern-matching.

Whether the second equation uses False or a wildcard makes no difference in either version, with or without optimisations.

maybe the compiler optimize this extra call.

If you ask it to optimise, in simple cases like this, GHC will eliminate the extra call.

let's imagine it is a non-trivial function.

Here's a possible problem. If the code is non-trivial enough, the compiler may not be able to eliminate all calls introduced by defining the function with not all arguments supplied. GHC is rather good at doing that and inlining calls, though, so you need a fair amount of non-triviality to make GHC fail eliminating calls to simple functions it knows when compiling your code (it can of course never inline calls to functions it doesn't know the implementation of when compiling the module in question).

If it's critical code, always check what code the compiler produces, for GHC, the relevant flags are -ddump-simpl to get the core produced after optimisations, and -ddump-asm to get the produced assembly.

like image 43
Daniel Fischer Avatar answered Dec 06 '22 12:12

Daniel Fischer


So i tend to think that the first is better and should be the one to pick : easier to read and to figure out what it does

I agree about readability. However, the second one is very much idiomatic Haskell and rather easier to read for experienced programmers: not performing that trivial eta reduction is quite suspicious and might actually distract from the intend. So for an optimised version, I'd rather write it out completely in explicit form:

True  `xor` False = True
False `xor` True  = True
_     `xor` _     = False

However, if such an alternative is considerably less readable than the most idiomatic one you should consider not replacing it, but adding hints so the compiler can still optimise it to the ideal version. As demonstrated by Daniel Fischer, GHC is quite clever by itself and will often get it right without help; when it doesn't it might help to add some INLINE and/or RULES pragmas. It's not easy to figure out how to do this to get optimal performance, but the same is true for writing fast Haskell98 code.

like image 39
leftaroundabout Avatar answered Dec 06 '22 14:12

leftaroundabout