How to use additional types for extra type safety in Haskell

Tags: types, haskell

I'm new to Haskell, and enjoying myself immensely.

As an exercise, I've written a program which tinkers with dates and times. In particular, I'm doing calculations involving minutes, seconds, and microseconds. Now I'm finding, while debugging, that I have a lot of errors where, for example, I'm adding minutes to seconds without multiplying by 60.

In order to move the debugging from run time to compile time, it occurred to me that I could use "type synonyms plus polymorphic functions" to do something like this:

module Main where
type SecX = Integer
toMin :: SecX -> MinX
toMin m = div m 60
type MinX = Integer
toSec :: MinX -> SecX
toSec = (60 *)
main :: IO ()
main = do
  let x = 20 :: MinX
  let y = 20 :: SecX
  let z = x + y       -- should not compile
  print [x,y,z]

but this approach gives me two problems:

  1. The line marked "should not compile" does in fact compile, and then proceeds to add 20 minutes to 20 seconds to give 40 somethings
  2. When I add a further type of MuSecX for microseconds, I can't compile additional instances of toMin and toSec:
    type MuSecX = Integer  
    toSec :: MuSecX -> SecX  
    toSec m = div m 1000000  
    toMin :: MuSecX -> MinX  
    toMin m = div m 60000000  

I'm obviously on the wrong path here. I'm sure I'm not the first to try to do something like this, so can anyone help, preferably with a "Canonical Haskell Way"?

Brent.Longborough asked Dec 11 '22 00:12



1 Answer

Type synonyms won't protect you from mixing types; that's not what they're for. They are literally just different names for the same type, used for convenience and/or documentation. SecX and Integer are still the very same type.

In order to create a completely new type, use newtype:

newtype SecX = SecX Integer

As you can see, the type now has a constructor, which can be used to construct new values of the type, as well as to get the Integer out of them by pattern-matching:

let x = SecX 20 
let (SecX a) = x  -- here, a == 20

Similarly for MinX:

newtype MinX = MinX Integer

And the conversion functions would look like this:

toMin :: SecX -> MinX
toMin (SecX m) = MinX $ div m 60

toSec :: MinX -> SecX
toSec (MinX m) = SecX $ 60 * m

And now the line indeed won't compile:

let x = MinX 20
let y = SecX 20 
let z = x + y       -- does not compile
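Put together as one file, the pieces so far might look like the sketch below. The helpers secValue and minValue are not part of the answer; they are made up here only so that the results can be printed without needing a Show instance yet.

```haskell
module Main where

-- Distinct types: the compiler no longer treats them as interchangeable.
newtype SecX = SecX Integer
newtype MinX = MinX Integer

-- Whole minutes contained in the given seconds (the remainder is dropped).
toMin :: SecX -> MinX
toMin (SecX s) = MinX (s `div` 60)

toSec :: MinX -> SecX
toSec (MinX m) = SecX (60 * m)

-- Hypothetical helpers for this sketch: unwrap to a plain Integer.
secValue :: SecX -> Integer
secValue (SecX s) = s

minValue :: MinX -> Integer
minValue (MinX m) = m

main :: IO ()
main = do
  let x = MinX 20
      y = SecX 150
  -- let z = x + y   -- rejected by the type checker: MinX and SecX differ
  print (secValue (toSec x))  -- 1200
  print (minValue (toMin y))  -- 2
```

Uncommenting the line with z should bring the type error back, which is exactly the compile-time check the question was after.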

But wait! This also doesn't compile anymore:

let sec1 = SecX 20
let sec2 = SecX 20 
let sec3 = sec1 + sec2       -- does not compile either

What's going on? Well, sec1 and sec2 are no longer just Integers (which was the whole point of the exercise), and so the function (+) is not defined for them.

But you can define it: the function (+) comes from the type class Num, so in order for SecX to support it, SecX needs to have an instance of Num as well:

instance Num SecX where
    (SecX a) + (SecX b) = SecX $ a + b
    (SecX a) * (SecX b) = SecX $ a * b
    abs (SecX a) = ...
    signum (SecX a) = ...
    fromInteger i = ...
    negate (SecX a) = ...

Wow, that's a lot to implement! Plus, what does it even mean to multiply seconds? That's a bit awkward, isn't it? Well, this is because the class Num is literally for numbers: its instances are expected to behave like numbers. That doesn't quite fit seconds, since although you can add them, most of the other operations have no sensible meaning.
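Purely as an illustration of that awkwardness, here is what the instance could look like if every method were filled in the most mechanical way, by delegating to the wrapped Integer. That choice is an assumption of this sketch, not something recommended here, and the deriving (Show, Eq) clause is added only so the results can be printed and compared.

```haskell
module Main where

newtype SecX = SecX Integer deriving (Show, Eq)

-- Every method simply delegates to the underlying Integer.
instance Num SecX where
  SecX a + SecX b = SecX (a + b)
  SecX a * SecX b = SecX (a * b)   -- "seconds times seconds": type-checks, means little
  abs (SecX a)    = SecX (abs a)
  signum (SecX a) = SecX (signum a)
  fromInteger     = SecX           -- bare literals silently become seconds
  negate (SecX a) = SecX (negate a)

main :: IO ()
main = do
  print (SecX 20 + SecX 20)  -- SecX 40
  print (SecX 2 * SecX 3)    -- SecX 6
  print (SecX 20 + 5)        -- SecX 25
```

Note the last line: because of fromInteger, the literal 5 is accepted as seconds without any constructor, which weakens the safety the newtype was supposed to buy.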

A better thing to implement for seconds is Semigroup (or perhaps even Monoid). Semigroup has a single operation <>, whose semantics is "glue two of these things together and get another one of the same kind of thing in return", which works very well for seconds:

instance Semigroup SecX where
    (SecX a) <> (SecX b) = SecX $ a + b

And now this will compile:

let sec1 = SecX 20
let sec2 = SecX 20 
let sec3 = sec1 <> sec2       -- compiles now, and sec3 == SecX 40

Similarly for minutes:

instance Semigroup MinX where
    (MinX a) <> (MinX b) = MinX $ a + b
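If you do go one step further to Monoid, a minimal sketch might look like this. It assumes a reasonably recent GHC (base 4.11 or later, where Semigroup is in the Prelude and is a superclass of Monoid); the deriving (Show, Eq) clause is added only so the results can be printed.

```haskell
module Main where

newtype SecX = SecX Integer deriving (Show, Eq)

instance Semigroup SecX where
  SecX a <> SecX b = SecX (a + b)

-- Zero seconds is the identity element for addition.
instance Monoid SecX where
  mempty = SecX 0

main :: IO ()
main = do
  print (SecX 20 <> SecX 20)                  -- SecX 40
  print (mconcat [SecX 5, SecX 10, SecX 15])  -- SecX 30
  print (mconcat [] :: SecX)                  -- SecX 0
```

The practical gain is mconcat (and friends such as foldMap), which lets you total a whole list of durations without writing the fold yourself.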

But wait! We're still in trouble! Now print [x, y, z] doesn't compile anymore.

Well, the first reason it doesn't compile is that the list [x, y, z] now contains elements of different types, which a Haskell list does not allow. But ok, since it's just for testing, we can do print x and then print y, no matter.

But that still wouldn't compile, because the function print requires that its argument has an instance of class Show - that's where the function show lives, which is what is used to convert the value to string for printing.

And of course, we can implement that for our types:

instance Show SecX where
    show (SecX a) = show a <> " seconds"

instance Show MinX where
    show (MinX a) = show a <> " minutes"

Or, alternatively, we can have the compiler automatically derive the instances for us:

newtype SecX = SecX Integer deriving Show
newtype MinX = MinX Integer deriving Show

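But in this case show (SecX 42) == "SecX 42" (you would only get a bare "42" by explicitly asking for newtype deriving, e.g. deriving newtype Show with the DerivingStrategies and GeneralizedNewtypeDeriving extensions), whereas with my manual implementation above show (SecX 42) == "42 seconds". Your call.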
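To see the two options side by side, here is a small sketch that gives SecX a hand-written instance and lets the compiler derive the one for MinX. Mixing the two styles like this is just for comparison, not a recommendation.

```haskell
module Main where

-- Hand-written instance: full control over the text.
newtype SecX = SecX Integer

instance Show SecX where
  show (SecX a) = show a <> " seconds"

-- Derived instance: the output mirrors the constructor syntax.
newtype MinX = MinX Integer deriving Show

main :: IO ()
main = do
  print (SecX 42)         -- 42 seconds
  print (MinX 42)         -- MinX 42
  print [SecX 1, SecX 2]  -- [1 seconds,2 seconds]
```

One thing to keep in mind with the hand-written version: its output is no longer valid Haskell syntax, so it is pleasant for logs but cannot be pasted back into code or parsed by a derived Read instance.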


Phew! Now we can finally move on to the second question: conversion functions.

The usual, basic approach is to just have different names for different functions:

minToSec :: MinX -> SecX
secToMin :: SecX -> MinX
minToMusec :: MinX -> MuSecX
secToMusec :: SecX -> MuSecX
... and so on
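The bodies of these functions are straightforward; one possible sketch is below. The MuSecX newtype is assumed to be declared the same way as the other two, and the deriving (Show, Eq) clauses are there only so the results can be printed.

```haskell
module Main where

newtype MinX   = MinX Integer   deriving (Show, Eq)
newtype SecX   = SecX Integer   deriving (Show, Eq)
newtype MuSecX = MuSecX Integer deriving (Show, Eq)

minToSec :: MinX -> SecX
minToSec (MinX m) = SecX (m * 60)

secToMin :: SecX -> MinX
secToMin (SecX s) = MinX (s `div` 60)   -- truncates to whole minutes

secToMusec :: SecX -> MuSecX
secToMusec (SecX s) = MuSecX (s * 1000000)

-- Composing the two steps avoids repeating the factor 60000000.
minToMusec :: MinX -> MuSecX
minToMusec = secToMusec . minToSec

main :: IO ()
main = do
  print (minToSec (MinX 2))    -- SecX 120
  print (secToMin (SecX 150))  -- MinX 2
  print (minToMusec (MinX 1))  -- MuSecX 60000000
```

With explicit names, each call site states both the source and the target unit, which some people find easier to read than an overloaded toSec.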

But if you really insist on keeping the same name for the functions, while having them work with different parameter types, that is possible too. More generally, this is called "overloading", and in Haskell the mechanism for creating overloaded functions is our old friend, the type class. Look above: we already implemented the function (<>) for different types. We can just make our own type class for this:

class TimeConversions a where
    toSec :: a -> SecX
    toMin :: a -> MinX
    toMuSec :: a -> MuSecX

And then add its implementations:

instance TimeConversions SecX where
    toSec = id
    toMin (SecX a) = MinX $ a `div` 60
    toMuSec (SecX a) = MuSecX $ a * 1000000

And similarly for minutes and microseconds.
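Spelled out, the remaining instances might look like the sketch below. The MuSecX newtype is assumed to be declared like the other two, the derived Show and Eq instances are used here only to keep the example short, and the division-based conversions truncate just as the SecX one does.

```haskell
module Main where

newtype SecX   = SecX Integer   deriving (Show, Eq)
newtype MinX   = MinX Integer   deriving (Show, Eq)
newtype MuSecX = MuSecX Integer deriving (Show, Eq)

class TimeConversions a where
  toSec   :: a -> SecX
  toMin   :: a -> MinX
  toMuSec :: a -> MuSecX

instance TimeConversions SecX where
  toSec            = id
  toMin (SecX a)   = MinX (a `div` 60)
  toMuSec (SecX a) = MuSecX (a * 1000000)

instance TimeConversions MinX where
  toSec (MinX a)   = SecX (a * 60)
  toMin            = id
  toMuSec (MinX a) = MuSecX (a * 60000000)

instance TimeConversions MuSecX where
  toSec (MuSecX a) = SecX (a `div` 1000000)
  toMin (MuSecX a) = MinX (a `div` 60000000)
  toMuSec          = id

main :: IO ()
main = do
  print (toSec (MinX 5))            -- SecX 300
  print (toMin (MuSecX 150000000))  -- MinX 2
  print (toMuSec (SecX 3))          -- MuSecX 3000000
```

The cost of this design is that every new unit means one more method in the class and one more line in every existing instance.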

Usage:

main = do
    let x = SecX 20
    let y = SecX 30
    let a = MinX 5
    let z = x <> y
    -- let u = x <> a  -- doesn't compile
    let v = x <> toSec a

    print [x, y, v]   -- [20 seconds,30 seconds,320 seconds]
    print a           -- 5 minutes
    print (toMin x)   -- 0 minutes
    print (toSec a)   -- 300 seconds

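Finally: don't use Integer, use Int. Integer is arbitrary precision, which means it's also slower. Int is a 32- or 64-bit value (depending on the platform), which should be enough for your purposes I think (though on a 32-bit platform a microsecond count in an Int would overflow after roughly 35 minutes).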

But for a real implementation, I would actually suggest floating-point numbers in the first place (e.g. Double). This would make conversions reversible up to floating-point rounding, instead of throwing the remainder away. With integers, toMin (SecX 20) == MinX 0 - we just lost some information.
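A minimal sketch of the floating-point variant is below. The names SecD, MinD, secToMinD and minToSecD are made up for this example so they don't clash with the integer versions. The value 30 is chosen because half a minute is exactly representable as a Double; something like 20 seconds (a third of a minute) is not, so round trips are in general only accurate up to rounding error.

```haskell
module Main where

-- Hypothetical Double-backed variants of the earlier newtypes.
newtype SecD = SecD Double deriving (Show, Eq)
newtype MinD = MinD Double deriving (Show, Eq)

-- Fractional division keeps the part that integer div would discard.
secToMinD :: SecD -> MinD
secToMinD (SecD s) = MinD (s / 60)

minToSecD :: MinD -> SecD
minToSecD (MinD m) = SecD (m * 60)

main :: IO ()
main = do
  print (secToMinD (SecD 30))              -- MinD 0.5
  print (minToSecD (secToMinD (SecD 30)))  -- SecD 30.0
```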


Fyodor Soikin answered Mar 08 '23 09:03
