I have a (fairly) legitimate case where there are two type instance implementations, and I want to specify a default one. After noting that doing modular arithmetic with Int
types resulted in lots of hash collisions, I want to try GHC's Int64. I have the following code:
class Hashable64 a where
hash64 :: a -> Int64
instance Hashable64 a => Hashable a where
hash = fromInteger . toInteger . hash64
instance Hashable64 a => Hashable64 [a] where
hash64 = foldl1 (\x y -> x + 22636946317 * y) . map hash64
and an instance Hashable64 Char
, which thus results in two implementations for Hashable String
, namely:
Data.Hashable
.Hashable64
instance, then converting to a regular Int
for an instance of Data.Hashable
.The second code path may be better because it performs hashing with Int64
s. Can I specify to use this derivation of the instance Hashable String?
Sorry, I forgot to add I have already tried the overlapping instances thing; perhaps I'm just not implementing it correctly? The documentation for overlapping instances says it works when one instance is more specific. But when I try to add a specific instance for Hashable String
, the situation doesn't improve. Full code at [ http://pastebin.com/9fP6LUX2 ] (sorry for the superfluous default header).
instance Hashable String where
hash x = hash (hash64 x)
I get
Matching instances:
instance (Hashable a) => Hashable [a] -- Defined in Data.Hashable
instance [overlap ok] Hashable String
-- Defined at Hashable64.hs:70:9-23
Any other solutions to this specific problem are welcome. A good solution might provide insight into this overlapping instances problem.
What's a typeclass in Haskell? A typeclass defines a set of methods that is shared across multiple types. For a type to belong to a typeclass, it needs to implement the methods of that typeclass. These implementations are ad-hoc: methods can have different implementations for different types.
If you actually want to define an instance, then you, well, instantiate n with N : instance Printable N where print n = ... read str = ... At this point the type signatures are all fixed (from the class definition) and what you need to write are the actual bindings of these functions, hence it must be = , not :: .
An instance of a class is an individual object which belongs to that class. In Haskell, the class system is (roughly speaking) a way to group similar types. (This is the reason we call them "type classes"). An instance of a class is an individual type which belongs to that class.
The second line, deriving (Eq, Show) , is called the deriving clause; it specifies that we want the compiler to automatically generate instances of the Eq and Show classes for our Pair type. The Haskell Report defines a handful of classes for which instances can be automatically generated.
This sort of situation is handled by GHC's OverlappingInstances extension. Roughly speaking, this extension allows instances to coexist despite the existence of some type(s) to which both could apply. For such types, GHC will select the "most specific" instance, which is a little fuzzy in some cases but usually does what you'd want it to.
This sort of situation, where you have one or more specialized instances and a single catch-all instance Foo a
as the "default", usually works pretty well.
The main stumbling blocks to be aware of with overlapping instances are:
If something forces GHC to select an instance on a polymorphic type that's ambiguous, it will refuse with potentially cryptic compiler errors
The context of an instance is ignored until after it's been selected, so don't try to distinguish between instances that way; there are workarounds but they're annoying.
The latter point would be relevant here if, for example, you have a list of some type that's not an instance of Hashable64
; GHC will select the more specific second instance, then fail because of the context, even if the full type (the list, not the element type) is an instance of Hashable64
and thus would work with the first, generic instance.
Edit: Oh, I see, I misinterpreted the situation slightly, regarding where the instances are coming from. Quoth the GHC User's Guide:
The willingness to be overlapped or incoherent is a property of the instance declaration itself (...). Neither flag is required in a module that imports and uses the instance declaration.
(...)
These rules make it possible for a library author to design a library that relies on overlapping instances without the library client having to know.
If an instance declaration is compiled without
-XOverlappingInstances
, then that instance can never be overlapped. This could perhaps be inconvenient. (...) We are interested to receive feedback on these points.
...in other words, overlapping is only allowed if the less specific instance was built with OverlappingInstances
enabled, which the instance for Hashable [a]
was not. So your instance for Hashable a
is allowed, but one for Hashable [Char]
fails, as observed.
This is a tidy illustration of why the User's Guide finds the current rules unsatisfactory (other rules would have their own problems, so it's not clear what the best approach, if any, would be).
Back in the here-and-now, you have multiple options, which are slightly less convenient than what you'd hoped for. Off the top of my head:
Alternate class: Define your equivalent of the Hashable
class, write the overlapped instances you want, and use generic instances with Hashable
in the context to fall back to the original as needed. This is problematic if you're using another library that expects Hashable
instances, rather than pre-hashed values or an explicit hash function.
Type wrapper: newtype
wrappers are something of a "standard" way to disambiguate instances (c.f. Monoid
). By using such a wrapper around your values, you'll be able to write whatever instances you please because none of the pre-defined instances will match. This becomes problematic if you have a lot of functions that would need to wrap/unwrap the newtype, though keep in mind that you can define other instances (e.g., Num
, Show
, etc.) for the wrapper easily and there's no overhead at run time.
There are other, more arcane, workarounds, but I can't offer too much explicit guidance because which is the least awkward tends to be very situation-dependent.
It's worth noting that you're definitely pushing the edges of what can be sensibly expressed with type classes, so it's not surprising that things are awkward. It's not a very satisfying situation, but there's little you can do when constrained to adding instances for a class defined elsewhere.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With