Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which dictionary does GHC choose when more than one is in scope?

Tags:

haskell

ghc

Consider the following example:

import Data.Constraint

class Bar a where
  bar :: a -> a

foo :: (Bar a) => Dict (Bar a) -> a -> a
foo Dict = bar

GHC has two choices for the dictionary to use when selecting a Bar instance in foo: it could use the dictionary from the Bar a constraint on foo, or it could use the runtime Dict to get a dictionary. See this question for an example where the dictionaries correspond to different instances.

Which dictionary does GHC use, and why is it the "correct" choice?

like image 378
crockeea Avatar asked Apr 08 '15 01:04

crockeea


2 Answers

It just picks one. This isn't the correct choice; it's a pretty well-known wart. You can cause crashes this way, so it's a pretty bad state of affairs. Here is a short example using nothing but GADTs that demonstrates that it is possible to have two different instances in scope at once:

-- file Class.hs
{-# LANGUAGE GADTs #-}
module Class where

data Dict a where
  Dict :: C a => Dict a

class C a where
  test :: a -> Bool

-- file A.hs
module A where

import Class

instance C Int where
  test _ = True

v :: Dict Int
v = Dict

-- file B.hs
module B where

import Class

instance C Int where
  test _ = False

f :: Dict Int -> Bool
f Dict = test (0 :: Int)

-- file Main.hs
import TestA
import TestB

main = print (f v)

You will find that Main.hs compiles just fine, and even runs. It prints True on my machine with GHC 7.10.1, but that's not a stable outcome. Turning this into a crash is left to the reader.

like image 112
Daniel Wagner Avatar answered Nov 16 '22 00:11

Daniel Wagner


GHC just picks one, and this is the correct choice. Any two dictionaries for the same constraint are supposed to be equal.

OverlappingInstances and IncoherentInstances are basically equivalent in destructive power; they both lose instance coherence by design (any two equal constraints in your program being satisfied by the same dictionary). OverlappingInstances gives you a little more ability to work out which instances will be used on a case-by-case basis, but this isn't that useful when you get to the point of passing around Dicts as first class values and so on. I would only consider using OverlappingInstances when I consider the overlapping instances extensionally equivalent (e.g., a more efficient but otherwise equal implementation for a specific type like Int), but even then, if I care enough about performance to write that specialized implementation, isn't it a performance bug if it doesn't get used when it could be?

In short, if you use OverlappingInstances, you give up the right to ask the question of which dictionary will be selected here.

Now it's true that you can break instance coherence without OverlappingInstances. In fact you can do it without orphans and without any extensions other than FlexibleInstances (arguably the problem is that the definition of "orphan" is wrong when FlexibleInstances is enabled). This is a very long-standing GHC bug, which hasn't been fixed in part because (a) it actually can't cause crashes directly as far as anybody seems to know, and (b) there might be a lot of programs that actually rely on having multiple instances for the same constraint in separate parts of the program, and that might be hard to avoid.

Getting back to the main topic, in principle it's important that GHC can select any dictionary that it has available to satisfy a constraint, because even though they are supposed to be equal, GHC might have more static information about some of them than others. Your example is a little bit too simple to be illustrative but imagine that you passed an argument to bar; in general GHC doesn't know anything about the dictionary passed in via Dict so it has to treat this as a call to an unknown function, but you called foo at a specific type T for which there was a Bar T instance in scope, then GHC would know that the bar from the Bar a constraint dictionary was T's bar and could generate a call to a known function, and potentially inline T's bar and do more optimizations as a result.

In practice, GHC is currently not this smart and it just uses the innermost dictionary available. It would probably be already better to always use the outermost dictionary. But cases like this where there are multiple dictionaries available are not very common, so we don't have good benchmarks to test on.

like image 26
Reid Barton Avatar answered Nov 16 '22 02:11

Reid Barton