Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

(Type) safely retrieving an element of a vector

Tags:

haskell

I am attempting to write a function

getColumn :: Int -> Vector n e -> e

that retrieves the i'th item from a vector of size n.

These vectors are defined as follows:

data Natural where
  Zero :: Natural
  Succ :: Natural -> Natural

type One   = Succ Zero
type Two   = Succ One
type Three = Succ Two
type Four  = Succ Three

data Vector n e where
  Nil  :: Vector Zero e
  (:|) :: e -> Vector n e -> Vector (Succ n) e

infixr :|

Is there some way to write the getColumn function in a way where the compiler will reject the code if the Int is too big for the Vector's size?

How would I write this function?

like image 847
sdasdadas Avatar asked May 30 '14 05:05

sdasdadas


1 Answers

First, we need a singleton type for the naturals. Singletons are run-time representations of type-level data, and they're called singleton types because each of them has only a single value. This is useful because it establishes a one-to-one correspondence between the values and the represented type; just knowing either the type or the value lets us infer the other.

This also lets us circumvent the limitation that Haskell types cannot depend on Haskell values: our types will depend on the type index of a singleton, but that type index can be in turn determined from the value of the singleton. This somewhat tortuous detour is not present in fully dependent programming languages such as Agda or Idris, where types can depend on values.

data SNatural (n :: Natural) where
    SZero :: SNatural Zero
    SSucc :: SNatural n -> SNatural (Succ n) 

deriving instance Show (SNatural n) -- requires StandaloneDeriving

We can see that for any n, SNatural n has only a single possible value; it just mirrors the original Natural definition.

There are multiple ways we can proceed towards vector indexing.

1. Defining the < constraint directly.

Defining < on naturals is straightforward:

{-# LANGUAGE TypeOperators, TypeFamilies #-}

type family a < b where 
    Zero   < Succ b = True
    a      < Zero   = False
    Succ a < Succ b = a < b  

Now we can express boundedness with a type equality constraint:

index :: ((m < n) ~ True) => Vector n a -> SNatural m -> a
index (x :| xs) SZero     = x 
index (x :| xs) (SSucc i) = index xs i
index _         _         = error "impossible"

main = do
    print $ index (0 :| 1 :| Nil) SZero -- 0
    print $ index (0 :| 1 :| Nil) (SSucc (SSucc SZero)) -- type error 

2. Using an alternative representation of naturals with baked-in bounds.

This is the standard solution in dependently typed programming, and IMO the simpler one, although it's a bit more difficult to understand at first. We call the bounded natural type Fin n (for "finite"), where n is a natural representing the upper bound. The trick is to index our number in a way such that the size of the value can't be greater than the index.

data Fin (n :: Natural) where
    FZero :: Fin (Succ n)
    FSucc :: Fin n -> Fin (Succ n) 

It's apparent that Fin Zero doesn't have any values. Fin (Succ Zero) has a single value, FZero, Fin (Succ (Succ Zero)) has two values, and thus Fin n always has n possible values. We can use it for safe indexing straightforwardly:

index :: Vector n a -> Fin n -> a
index (x :| xs) FZero     = x
index (x :| xs) (FSucc i) = index xs i
index _         _         = error "impossible" 

main = do
    print $ index (0 :| 1 :| Nil) (FSucc (FSucc FZero)) -- type error

Addendum: using the singletons library and doing safe indexing by Int-s.

As we've seen, doing dependent programming in Haskell involves a hefty amount of boilerplate. The type-level data and their singletons are more or less identical, yet still need separate definitions. The functions operating on them have to be duplicated similarly. Luckily, the singletons package can generate the boilerplate for us:

{-# LANGUAGE 
    TypeFamilies, GADTs, DataKinds, PolyKinds,
    ScopedTypeVariables, TemplateHaskell #-}

import Data.Singletons.TH

-- We get the "SNat n" singleton generated too. 
$(singletons[d| data Nat = Z | S Nat |])

data Vector n e where
  Nil  :: Vector Z e
  (:|) :: e -> Vector n e -> Vector (S n) e
infixr :|

data Fin n where
    FZ :: Fin (S n)
    FS :: Fin n -> Fin (S n) 

index :: Vector n a -> Fin n -> a
index (x :| xs) FZ     = x
index (x :| xs) (FS i) = index xs i
index _         _      = error "impossible"

The package also includes convenient ways to generate singleton values from types as needed:

foo :: SNat (S (S (S Z)))
foo = sing

sing is a polymorphic value that may serve as a stand-in for any singleton value. Sometimes the correct value can be inferred from the context, but other times we have to annotate its type index, usually using the ScopedTypeVariables extension.

Now we can also index safely by Int-s without being overmuch bothered by boilerplate (not a catastrophic amount of boilerplate though; implementing sing for Nat by hand would require one more typeclass and a couple of instances).

In general, verified programming is not about validating data compile time (as we've seen in the examples above), but rather about writing functions that operate in a provably correct way, even on data that is unknown compile-time (you could say that it's irrelevant when we validate the data, provided that the validation functions are correct). Our index can viewed as a semi-verified function, because it is impossible to implement an error-throwing version of it that typechecks (modulo bottoms and divergence).

To safely index by Int-s, we just have to write a verified conversion function from Int to Fin, and then use index as usual:

checkBound :: Int -> SNat n -> Maybe (Fin n)
checkBound i _ | i < 0 = Nothing
checkBound 0 (SS _)    = Just FZ
checkBound i SZ        = Nothing
checkBound i (SS n)    = case checkBound (i - 1) n of
    Just n -> Just (FS n)
    Nothing -> Nothing

Again, the magic of checkBound is that it is impossible to write a definition that returns a Fin that violates the given bound.

indexInt :: forall n a . SingI n => Vector n a -> Int -> Maybe a
indexInt v i = case checkBound i (sing :: SNat n) of
    Just i  -> Just (index v i)
    Nothing -> Nothing

Here we need to utilize some singletons machinery: the SingI constraint allows us to conjure up an appropriate singleton value using sing. It's a harmless class constraint, because every possible n is an instance of SingI, by construction.

like image 98
András Kovács Avatar answered Oct 18 '22 12:10

András Kovács