Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Statically enforcing that two objects were created from the same (Int) "seed"

Tags:

types

haskell

ghc

In a library I'm working on, I have an API similar to the following:

data Collection a = Collection Seed {-etc...-}
type Seed = Int

newCollection :: Seed -> IO (Collection a)
newCollection = undefined

insert :: a -> Collection a -> IO () -- ...and other mutable set-like functions
insert = undefined

mergeCollections :: Collection a -> Collection a -> IO (Collection a)
mergeCollections (Collection s0 {-etc...-}) (Collection s1 {-etc...-}) 
  | s0 /= s1  = error "This is invalid; how can we make it statically unreachable?"
  | otherwise = undefined

I'd like to be able to enforce that the user cannot call mergeCollections on Collections created with different Seed values.

I thought of trying to tag Collection with a type-level natural: I think this would mean that the Seed would have to be statically known at compile time, but my users might be getting it from an environment variable or user input, so I don't think that would work.

I also hoped I might be able to do something like:

newtype Seed u = Seed Int
newSeed :: Int -> Seed u
newCollection :: Seed u -> IO (Collection u a)
mergeCollections :: Collection u a -> Collection u a -> IO (Collection u a)

Where somehow a Seed is tagged with a unique type in some way, such that the type system could track that both arguments to merge were created from the seed returned by the same invocation of newSeed. To be clear in this (hand-wavy) scheme a and b here would somehow not unify: let a = newSeed 1; b = newSeed 1;.

Is something like this possible?

Examples

Here are some examples of ways I can imagine users creating Seeds and Collections. Users would like to use the other operations (inserting, merging, etc) as freely as they could with any other IO mutable collection:

  1. We only need one seed for all Collections (dynamically) created during the program, but the user must be able to specify in some way how the seed should be determined from the environment at runtime.

  2. One or more static keys gathered from environment vars (or command line args):

    main = do
       s1 <- getEnv "SEED1"
       s2 <- getEnv "SEED2"
       -- ... many Collections may be created dynamically from these seeds
       -- and dynamically merged later
    
like image 908
jberryman Avatar asked Feb 07 '23 09:02

jberryman


1 Answers

Probably not in a convenient way. For handling seeds that are known only at runtime, you can use existential types; but then you cannot statically check that two of these existentially wrapped collections match up. The much simpler solution is simply this:

merge :: Collection a -> Collection a -> IO (Maybe (Collection a))

On the other hand, if it is okay to force all operations to be done "together", in a sense, then you can do something like what the ST monad does: group all the operations together, then supply an operation for "running" all the operations that only works if the operations don't leak collections by demanding they be perfectly polymorphic over a phantom variable, hence that the return type doesn't mention the phantom variable. (Tikhon Jelvis also suggests this in his comments.) Here's how that might look:

{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Collection (Collection, COp, newCollection, merge, inspect, runCOp) where

import Control.Monad.Reader

type Seed = Int
data Collection s a = Collection Seed
newtype COp s a = COp (Seed -> a) deriving (Functor, Applicative, Monad, MonadReader Seed)

newCollection :: COp s (Collection s a)
newCollection = Collection <$> ask

merge :: Collection s a -> Collection s a -> COp s (Collection s a)
merge l r = return (whatever l r) where
  whatever = const

-- just an example; substitute whatever functions you want to have for
-- consuming Collections
inspect :: Collection s a -> COp s Int
inspect (Collection seed) = return seed

runCOp :: (forall s. COp s a) -> Seed -> a
runCOp (COp f) = f

Note in particular that the COp and Collection constructors are not exported. Consequently we need never fear that a Collection will escape its COp; runCOp newCollection is not well-typed (and any other operation that tries to "leak" the collection to the outside world will have the same property). Therefore it will not be possible to pass a Collection constructed with one seed to a merge operating in the context of another seed.

like image 151
Daniel Wagner Avatar answered Feb 08 '23 22:02

Daniel Wagner