
Choosing between a class and a record

Basic question: what design principles should one follow when choosing between using a class or using a record (with polymorphic fields) ?

First, we know that classes and records are essentially equivalent (since in Core, classes get desugared to dictionaries, which are just records). Nevertheless, there are differences: classes are passed implicitly, records must be explicit.

Looking a little deeper, classes are really useful when:

  1. we have many different representations of 'the same thing', and
  2. in actual usage, which representation is used can be inferred.

Classes are awkward when we have (up to parametric polymorphism) only one representation of our data, but multiple instances for it. This leads to the syntactic noise of using newtype to add extra tags (which exist only in our code, since such tags get erased at run time), unless we turn on all sorts of troublesome extensions (e.g. overlapping and/or undecidable instances).
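To make the newtype noise concrete, here is a minimal sketch. Everything in it (Combine, Additive, Multiplicative, CombineDict and the helper functions) is a hypothetical name invented for illustration, not something from a library. Int has two sensible ways to be combined, so the class version needs one newtype tag per instance, while the record version just passes a different dictionary.

```haskell
-- Hypothetical class: a unit element and a way to combine two values.
class Combine a where
  unit    :: a
  combine :: a -> a -> a

-- One representation (Int), two instances: each needs its own newtype tag.
newtype Additive       = Additive Int       deriving (Eq, Show)
newtype Multiplicative = Multiplicative Int deriving (Eq, Show)

instance Combine Additive where
  unit = Additive 0
  combine (Additive x) (Additive y) = Additive (x + y)

instance Combine Multiplicative where
  unit = Multiplicative 1
  combine (Multiplicative x) (Multiplicative y) = Multiplicative (x * y)

-- The instance is chosen implicitly from the element type.
combineAll :: Combine a => [a] -> a
combineAll = foldr combine unit

-- Record version: the same information as an explicit dictionary.
data CombineDict a = CombineDict
  { dUnit    :: a
  , dCombine :: a -> a -> a
  }

additive, multiplicative :: CombineDict Int
additive       = CombineDict 0 (+)
multiplicative = CombineDict 1 (*)

-- The dictionary is chosen explicitly by the caller; no wrapping needed.
combineAllWith :: CombineDict a -> [a] -> a
combineAllWith d = foldr (dCombine d) (dUnit d)
```

With the class, callers write something like combineAll (map Additive xs) and unwrap the result; with the record, they write combineAllWith additive xs directly.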

Of course, things get muddier: what if I want to have constraints on my types? Let's pick a real example:

class (Bounded i, Enum i) => Partition a i where
    index :: a -> i

I could just as easily have done

data Partition a i = Partition { index :: a -> i} 

But now I've lost my constraints, and I will have to add them to specific functions instead.
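As a rough sketch of what "adding them to specific functions" looks like with the record version: the Partition record below is the one from the question, while cells, classify, Parity and byParity are hypothetical names made up for this example.

```haskell
-- The record from the question: no constraints on i here.
data Partition a i = Partition { index :: a -> i }

-- The (Bounded i, Enum i) constraint reappears on each function that needs it.
cells :: (Bounded i, Enum i) => Partition a i -> [i]
cells _ = [minBound .. maxBound]

-- Group the inputs by the cell that the partition assigns them to.
classify :: (Bounded i, Enum i, Eq i) => Partition a i -> [a] -> [(i, [a])]
classify p xs = [ (i, filter ((== i) . index p) xs) | i <- cells p ]

-- A hypothetical index type and partition, for illustration only.
data Parity = Even | Odd deriving (Bounded, Enum, Eq, Show)

byParity :: Partition Int Parity
byParity = Partition (\n -> if even n then Even else Odd)
```

Nothing stops someone from building a Partition whose index type is not Bounded or Enum; they only find out when they try to call cells or classify on it.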

Are there design guidelines that would help me out?

Jacques Carette asked Nov 12 '11 18:11



2 Answers

I tend to see no issue with only requiring constraints on functions. The issue is, I suppose, that your data structure no longer models precisely what you intend it to. On the other hand, if you think of it as a data structure first and foremost, then that should matter less.

I feel like I still don't necessarily have a good grasp on the question, and this is about as vague as can be, but my rule of thumb tends to be that typeclasses are things that obey laws (or model meaning), and datatypes are things that encode a certain quantity of information.

When we want to layer behavior in complex ways, I've found that typeclasses start off enticingly, but can get painful quickly and switching to dictionary-passing makes things more straightforward. Which is to say that when we want implementations to be interoperable, then we should fall back to a uniform dictionary type.


This is take two, expanding a bit on a concrete example, but still just sort of spinning ideas...

Suppose we want to model probability distributions over the reals. Two natural representations come to mind.

A) Typeclass-driven

class PDist a where
    sample :: a -> Gen -> Double

B) Dictionary-driven

data PDist = PDist (Gen -> Double) 

The former lets us do

data NormalDist = NormalDist Double Double -- mean, var
instance PDist NormalDist where...

data LognormalDist = LognormalDist Double Double
instance PDist LognormalDist where...

The latter lets us do

mkNormalDist :: Double -> Double -> PDist...
mkLognormalDist :: Double -> Double -> PDist...

In the former, we can write

data SumDist a b = SumDist a b
instance (PDist a, PDist b) => PDist (SumDist a b)...

in the latter we can simply write

sumDist :: PDist -> PDist -> PDist 

So what are the tradeoffs? Typeclass-driven lets us specify what distributions we're given. The tradeoff is that we have to construct an algebra of distributions explicitly, including new types for their combinations. Dictionary-driven doesn't let us restrict the distributions we're given (or even check that they're well-formed), but in return we can do whatever the heck we want.

Furthermore, we can write a parseDist :: String -> PDist relatively easily, but we have to go through some angst to do the equivalent for the typeclass approach.

So this is, in a sense, the typed/untyped, static/dynamic tradeoff at another level. We can give it a twist though, and argue that the typeclass, along with associated algebraic laws, specifies the semantics of a probability distribution. And the PDist type can indeed be made an instance of the PDist typeclass. Meanwhile, we can resign ourselves to using the PDist type (rather than typeclass) nearly everywhere, while thinking of it as isomorphic to the tower of instances and datatypes necessary to use the typeclass more "richly."

In fact, we can even define basic PDist functions in terms of typeclass functions, e.g.

mkNormalPDist m v = PDist (sample $ NormalDist m v)

So there's lots of room in the design space to slide between the two representations as necessary...
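Here is one way the pieces above could fit together in a single module. This is only a sketch under some assumptions: Gen is never defined in this answer, so the version below (a plain supply of uniform draws, with draw and splitGen helpers) is a made-up stand-in, and the Box-Muller sampler is just one possible choice. The record is renamed PDistRec, since a class and a type cannot share a name within one module.

```haskell
-- Hypothetical stand-in for a random source: a supply of uniform draws in (0, 1).
newtype Gen = Gen [Double]

-- Take one draw; falls back to 0.5 if the supply is exhausted.
draw :: Gen -> (Double, Gen)
draw (Gen (u:us)) = (u, Gen us)
draw (Gen [])     = (0.5, Gen [])

-- Split one supply into two by alternating elements.
splitGen :: Gen -> (Gen, Gen)
splitGen (Gen us) = (Gen (everyOther us), Gen (everyOther (drop 1 us)))
  where
    everyOther (x:_:rest) = x : everyOther rest
    everyOther xs         = xs

-- A) Typeclass-driven
class PDist a where
  sample :: a -> Gen -> Double

data NormalDist = NormalDist Double Double -- mean, var

instance PDist NormalDist where
  sample (NormalDist m v) g0 =
    let (u1, g1) = draw g0
        (u2, _)  = draw g1
        z        = sqrt (-2 * log u1) * cos (2 * pi * u2) -- Box-Muller transform
    in  m + sqrt v * z

data SumDist a b = SumDist a b

instance (PDist a, PDist b) => PDist (SumDist a b) where
  sample (SumDist a b) g =
    let (ga, gb) = splitGen g
    in  sample a ga + sample b gb

-- B) Dictionary-driven (renamed from PDist to avoid the name clash)
newtype PDistRec = PDistRec { runPDist :: Gen -> Double }

-- The record is itself an instance of the class.
instance PDist PDistRec where
  sample = runPDist

-- Any instance of the class can be packed into the record.
mkNormalPDist :: Double -> Double -> PDistRec
mkNormalPDist m v = PDistRec (sample (NormalDist m v))

-- Combination needs no new type on the record side.
sumDist :: PDistRec -> PDistRec -> PDistRec
sumDist (PDistRec f) (PDistRec h) = PDistRec $ \g ->
  let (ga, gb) = splitGen g
  in  f ga + h gb
```

With these definitions, SumDist (NormalDist 0 1) (NormalDist 2 4) carries its structure in its type, while sumDist (mkNormalPDist 0 1) (mkNormalPDist 2 4) is just another PDistRec; both produce the same sample for the same Gen.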

sclv answered Sep 28 '22 00:09


Note: I'm not sure that I understand the OP exactly. Suggestions/comments for improvement appreciated!


Background:

When I first learned about typeclasses in Haskell, the general rule-of-thumb I picked up was that, in comparison to Java-like languages:

  • typeclasses are similar to interfaces
  • data are similar to classes

Here's another SO question and answer that describe guidelines for using interfaces (also some drawbacks of interface over-use). My interpretation:

  • records/Java-classes are what something is
  • interfaces/typeclasses are roles that a concretion can fulfil
  • multiple, unrelated concretions can fulfil the same role

I bet you already know all this.


The guidelines I try to follow for my own code are:

  • typeclasses are for abstractions
  • records are for concretions

So in practice this means:

  • let the needs of the data determine the records
  • let the client code determine what the interfaces are -- clients should depend on abstractions, and thereby drive the creation and design of typeclasses

Example:

typeclass Show, with function show :: (Show s) => s -> String: for data that can be represented as a String.

  • clients just want to turn data into strings
  • clients don't care what the data (concretion) is -- they only care that it can be represented as a string
  • role of implementing data: can be string-ified
  • this could not be achieved without a typeclass -- each datatype would require a conversion function with a different name, what a pain to deal with!
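A small sketch of the Show example, assuming two made-up datatypes (Celsius and Point) and a made-up client function (report). The client only depends on the role "can be string-ified", not on either concretion.

```haskell
import Data.List (intercalate)

-- Hypothetical datatypes standing in for two unrelated concretions.
newtype Celsius = Celsius Double
data Point = Point Int Int

instance Show Celsius where
  show (Celsius d) = show d ++ " C"

instance Show Point where
  show (Point x y) = "(" ++ show x ++ ", " ++ show y ++ ")"

-- Client code: works for any type that fulfils the Show role.
report :: Show s => [s] -> String
report = intercalate "; " . map show
```

Without the typeclass, report would need a separate conversion function passed in (or a separate version) for each datatype.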
Matt Fenwick answered Sep 27 '22 23:09