Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to specify a typeclass instance?

I have a (fairly) legitimate case where there are two type instance implementations, and I want to specify a default one. After noting that doing modular arithmetic with Int types resulted in lots of hash collisions, I want to try GHC's Int64. I have the following code:

class Hashable64 a where
    hash64 :: a -> Int64

instance Hashable64 a => Hashable a where
    hash = fromInteger . toInteger . hash64

instance Hashable64 a => Hashable64 [a] where
    hash64 = foldl1 (\x y -> x + 22636946317 * y) . map hash64

and an instance Hashable64 Char, which thus results in two implementations for Hashable String, namely:

  • The one defined in Data.Hashable.
  • Noting that it is an Hashable64 instance, then converting to a regular Int for an instance of Data.Hashable.

The second code path may be better because it performs hashing with Int64s. Can I specify to use this derivation of the instance Hashable String?

Edit 1

Sorry, I forgot to add I have already tried the overlapping instances thing; perhaps I'm just not implementing it correctly? The documentation for overlapping instances says it works when one instance is more specific. But when I try to add a specific instance for Hashable String, the situation doesn't improve. Full code at [ http://pastebin.com/9fP6LUX2 ] (sorry for the superfluous default header).

instance Hashable String where
    hash x = hash (hash64 x)

I get

Matching instances:
  instance (Hashable a) => Hashable [a] -- Defined in Data.Hashable
  instance [overlap ok] Hashable String
    -- Defined at Hashable64.hs:70:9-23

Edit 2

Any other solutions to this specific problem are welcome. A good solution might provide insight into this overlapping instances problem.

like image 267
gatoatigrado Avatar asked May 31 '11 22:05

gatoatigrado


People also ask

How do you define Typeclass in Haskell?

What's a typeclass in Haskell? A typeclass defines a set of methods that is shared across multiple types. For a type to belong to a typeclass, it needs to implement the methods of that typeclass. These implementations are ad-hoc: methods can have different implementations for different types.

How do I make an instance in Haskell?

If you actually want to define an instance, then you, well, instantiate n with N : instance Printable N where print n = ... read str = ... At this point the type signatures are all fixed (from the class definition) and what you need to write are the actual bindings of these functions, hence it must be = , not :: .

What is an instance of a Haskell type class?

An instance of a class is an individual object which belongs to that class. In Haskell, the class system is (roughly speaking) a way to group similar types. (This is the reason we call them "type classes"). An instance of a class is an individual type which belongs to that class.

What does deriving show mean Haskell?

The second line, deriving (Eq, Show) , is called the deriving clause; it specifies that we want the compiler to automatically generate instances of the Eq and Show classes for our Pair type. The Haskell Report defines a handful of classes for which instances can be automatically generated.


1 Answers

This sort of situation is handled by GHC's OverlappingInstances extension. Roughly speaking, this extension allows instances to coexist despite the existence of some type(s) to which both could apply. For such types, GHC will select the "most specific" instance, which is a little fuzzy in some cases but usually does what you'd want it to.

This sort of situation, where you have one or more specialized instances and a single catch-all instance Foo a as the "default", usually works pretty well.

The main stumbling blocks to be aware of with overlapping instances are:

  • If something forces GHC to select an instance on a polymorphic type that's ambiguous, it will refuse with potentially cryptic compiler errors

  • The context of an instance is ignored until after it's been selected, so don't try to distinguish between instances that way; there are workarounds but they're annoying.

The latter point would be relevant here if, for example, you have a list of some type that's not an instance of Hashable64; GHC will select the more specific second instance, then fail because of the context, even if the full type (the list, not the element type) is an instance of Hashable64 and thus would work with the first, generic instance.


Edit: Oh, I see, I misinterpreted the situation slightly, regarding where the instances are coming from. Quoth the GHC User's Guide:

The willingness to be overlapped or incoherent is a property of the instance declaration itself (...). Neither flag is required in a module that imports and uses the instance declaration.

(...)

These rules make it possible for a library author to design a library that relies on overlapping instances without the library client having to know.

If an instance declaration is compiled without -XOverlappingInstances, then that instance can never be overlapped. This could perhaps be inconvenient. (...) We are interested to receive feedback on these points.

...in other words, overlapping is only allowed if the less specific instance was built with OverlappingInstances enabled, which the instance for Hashable [a] was not. So your instance for Hashable a is allowed, but one for Hashable [Char] fails, as observed.

This is a tidy illustration of why the User's Guide finds the current rules unsatisfactory (other rules would have their own problems, so it's not clear what the best approach, if any, would be).

Back in the here-and-now, you have multiple options, which are slightly less convenient than what you'd hoped for. Off the top of my head:

  • Alternate class: Define your equivalent of the Hashable class, write the overlapped instances you want, and use generic instances with Hashable in the context to fall back to the original as needed. This is problematic if you're using another library that expects Hashable instances, rather than pre-hashed values or an explicit hash function.

  • Type wrapper: newtype wrappers are something of a "standard" way to disambiguate instances (c.f. Monoid). By using such a wrapper around your values, you'll be able to write whatever instances you please because none of the pre-defined instances will match. This becomes problematic if you have a lot of functions that would need to wrap/unwrap the newtype, though keep in mind that you can define other instances (e.g., Num, Show, etc.) for the wrapper easily and there's no overhead at run time.

There are other, more arcane, workarounds, but I can't offer too much explicit guidance because which is the least awkward tends to be very situation-dependent.

It's worth noting that you're definitely pushing the edges of what can be sensibly expressed with type classes, so it's not surprising that things are awkward. It's not a very satisfying situation, but there's little you can do when constrained to adding instances for a class defined elsewhere.

like image 108
C. A. McCann Avatar answered Sep 17 '22 22:09

C. A. McCann