Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Memoizing IO Computations in Haskell

I have a function

f :: MonadIO m => a -> m b

which takes some input and returns an IO computation that will yield output. I want to "memoize" f so that I only ever perform these computations once for each input. For example, if

f :: String -> IO String
f s = putStrLn ("hello " ++ s) >> return s

then I want a function memoize such that

do
  mf <- memoize f
  s <- mf "world"
  t <- mf "world"
  return (s,t)

prints "hello world" exactly once and returns ("world", "world"). The program I'm writing is multi-threaded, so this property should hold even if different threads are all calling mf.

Below is the (terrible) solution I've come up with so far. My question is whether and how it can be improved.

memoize :: (MonadIO m, Ord a) => (a -> m b) -> m (a -> m b)
memoize f = do
  cache <- liftIO $ newTVarIO Map.empty
  return $ \a -> do
              v <- liftIO $ atomically $ lookupInsert cache a
              b <- maybe (f a) return =<< liftIO (atomically $ takeTMVar v)
              liftIO $ atomically $ putTMVar v $ Just b
              return b
    where
      lookupInsert :: Ord a => TVar (Map a (TMVar (Maybe b))) -> a -> STM (TMVar (Maybe b))
      lookupInsert cache a = do
                         mv <- Map.lookup a <$> readTVar cache
                         case mv of
                           Just v -> return v
                           Nothing -> do
                                   v <- newTMVar Nothing
                                   modifyTVar cache (Map.insert a v)
                                   return v

There are a few things going on here:

1) cache has type TVar (Map a (TMVar (Maybe b))). It maps inputs to TMVar's that contain either a computed value, or Nothing (which indicates that a value hasn't been computed yet). The function lookupInsert inspects cache, and inserts a new TMVar initialized to Nothing if none is present already.

2) The returned action first obtains the v :: TMVar (Maybe b) associated to a, then takes it, and either performs the computation f a to obtain a result or returns the value stored in the Maybe if it's available. This take and put pattern is so that two different threads don't both run the computation f a after seeing that it hasn't been run yet.

like image 886
davidsd Avatar asked Nov 13 '22 07:11

davidsd


1 Answers

I thought what you wanted was impossible, but turns out it is.

https://stackoverflow.com/a/9458721/1798971

I still can't figure out why that works!

like image 179
Raahul Kumar Avatar answered Dec 19 '22 13:12

Raahul Kumar