I am attempting to write a function
getColumn :: Int -> Vector n e -> e
that retrieves the i'th item from a vector of size n
.
These vectors are defined as follows:
data Natural where
Zero :: Natural
Succ :: Natural -> Natural
type One = Succ Zero
type Two = Succ One
type Three = Succ Two
type Four = Succ Three
data Vector n e where
Nil :: Vector Zero e
(:|) :: e -> Vector n e -> Vector (Succ n) e
infixr :|
Is there some way to write the getColumn
function in a way where the compiler will reject the code if the Int
is too big for the Vector
's size?
How would I write this function?
First, we need a singleton type for the naturals. Singletons are run-time representations of type-level data, and they're called singleton types because each of them has only a single value. This is useful because it establishes a one-to-one correspondence between the values and the represented type; just knowing either the type or the value lets us infer the other.
This also lets us circumvent the limitation that Haskell types cannot depend on Haskell values: our types will depend on the type index of a singleton, but that type index can be in turn determined from the value of the singleton. This somewhat tortuous detour is not present in fully dependent programming languages such as Agda or Idris, where types can depend on values.
data SNatural (n :: Natural) where
SZero :: SNatural Zero
SSucc :: SNatural n -> SNatural (Succ n)
deriving instance Show (SNatural n) -- requires StandaloneDeriving
We can see that for any n
, SNatural n
has only a single possible value; it just mirrors the original Natural
definition.
There are multiple ways we can proceed towards vector indexing.
<
constraint directly.Defining <
on naturals is straightforward:
{-# LANGUAGE TypeOperators, TypeFamilies #-}
type family a < b where
Zero < Succ b = True
a < Zero = False
Succ a < Succ b = a < b
Now we can express boundedness with a type equality constraint:
index :: ((m < n) ~ True) => Vector n a -> SNatural m -> a
index (x :| xs) SZero = x
index (x :| xs) (SSucc i) = index xs i
index _ _ = error "impossible"
main = do
print $ index (0 :| 1 :| Nil) SZero -- 0
print $ index (0 :| 1 :| Nil) (SSucc (SSucc SZero)) -- type error
This is the standard solution in dependently typed programming, and IMO the simpler one, although it's a bit more difficult to understand at first. We call the bounded natural type Fin n
(for "finite"), where n
is a natural representing the upper bound. The trick is to index our number in a way such that the size of the value can't be greater than the index.
data Fin (n :: Natural) where
FZero :: Fin (Succ n)
FSucc :: Fin n -> Fin (Succ n)
It's apparent that Fin Zero
doesn't have any values. Fin (Succ Zero)
has a single value, FZero
, Fin (Succ (Succ Zero))
has two values, and thus Fin n
always has n
possible values. We can use it for safe indexing straightforwardly:
index :: Vector n a -> Fin n -> a
index (x :| xs) FZero = x
index (x :| xs) (FSucc i) = index xs i
index _ _ = error "impossible"
main = do
print $ index (0 :| 1 :| Nil) (FSucc (FSucc FZero)) -- type error
singletons
library and doing safe indexing by Int
-s.As we've seen, doing dependent programming in Haskell involves a hefty amount of boilerplate. The type-level data and their singletons are more or less identical, yet still need separate definitions. The functions operating on them have to be duplicated similarly. Luckily, the singletons
package can generate the boilerplate for us:
{-# LANGUAGE
TypeFamilies, GADTs, DataKinds, PolyKinds,
ScopedTypeVariables, TemplateHaskell #-}
import Data.Singletons.TH
-- We get the "SNat n" singleton generated too.
$(singletons[d| data Nat = Z | S Nat |])
data Vector n e where
Nil :: Vector Z e
(:|) :: e -> Vector n e -> Vector (S n) e
infixr :|
data Fin n where
FZ :: Fin (S n)
FS :: Fin n -> Fin (S n)
index :: Vector n a -> Fin n -> a
index (x :| xs) FZ = x
index (x :| xs) (FS i) = index xs i
index _ _ = error "impossible"
The package also includes convenient ways to generate singleton values from types as needed:
foo :: SNat (S (S (S Z)))
foo = sing
sing
is a polymorphic value that may serve as a stand-in for any singleton value. Sometimes the correct value can be inferred from the context, but other times we have to annotate its type index, usually using the ScopedTypeVariables extension.
Now we can also index safely by Int
-s without being overmuch bothered by boilerplate (not a catastrophic amount of boilerplate though; implementing sing
for Nat
by hand would require one more typeclass and a couple of instances).
In general, verified programming is not about validating data compile time (as we've seen in the examples above), but rather about writing functions that operate in a provably correct way, even on data that is unknown compile-time (you could say that it's irrelevant when we validate the data, provided that the validation functions are correct). Our index
can viewed as a semi-verified function, because it is impossible to implement an error-throwing version of it that typechecks (modulo bottoms and divergence).
To safely index by Int
-s, we just have to write a verified conversion function from Int
to Fin
, and then use index
as usual:
checkBound :: Int -> SNat n -> Maybe (Fin n)
checkBound i _ | i < 0 = Nothing
checkBound 0 (SS _) = Just FZ
checkBound i SZ = Nothing
checkBound i (SS n) = case checkBound (i - 1) n of
Just n -> Just (FS n)
Nothing -> Nothing
Again, the magic of checkBound
is that it is impossible to write a definition that returns a Fin
that violates the given bound.
indexInt :: forall n a . SingI n => Vector n a -> Int -> Maybe a
indexInt v i = case checkBound i (sing :: SNat n) of
Just i -> Just (index v i)
Nothing -> Nothing
Here we need to utilize some singletons
machinery: the SingI
constraint allows us to conjure up an appropriate singleton value using sing
. It's a harmless class constraint, because every possible n
is an instance of SingI
, by construction.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With