
"Holding" a Data Map in memory

I have three data structures defined as follows, where S, LL, and M refer to Set, ListLike, and Map, and Object is a ByteString:

nouns :: IO [Object]
nouns = liftM LL.words $ B.readFile "nounlist.txt"

obj :: IO ObjectSet
obj =  liftM S.fromList nouns

actions :: IO ActionMap
actions = do
  n <- nouns
  let l = foldl' (\z x -> (x,Sell):(x,Create):z) [] n
  return $ M.fromList $
    (\(x,y) -> ((x, Verb y []), Out (Verb y []) x)) <$> l

Now I have one function that binds the unevaluated Map and Set to the variables a and o. Once it enters query, an infinite loop accepts queries from user input and processes them. Appropriate responses are generated via lookups.

process :: IO ()
process = do
  a <- actions
  o <- obj
  forever $ query "" a o

Keep in mind that my Map holds 300,000+ key-value pairs. The first query pays a one-time evaluation cost of roughly 3-5 seconds on my computer, which is fine and completely expected. Every subsequent call is snappy and responsive, just the way I want it. However, this only works because I am running the code as a standalone executable and have the luxury of staying within the IO () of process. If I were to turn this code (and the accompanying code not listed) into a library to interface with, say, a Snap Framework web application, I wouldn't necessarily have this luxury.

Essentially: if I remove the forever from process, the evaluated Map and Set will surely get garbage-collected. Indeed, this is what happens when I call the function from a Snap application (I can't keep forever because it would block the Snap application). Every call from the Snap application incurs the same 3-5 second overhead because it re-evaluates the data structures in question.
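The repeated cost described above can be reproduced in miniature. The sketch below is only an illustration under assumptions: buildTable is a hypothetical stand-in for actions, and the counter exists purely to show how often the build runs. It suggests the cost comes from running the IO action again on every call, and that binding the result once and passing it along avoids this.

```haskell
import qualified Data.Map.Strict as M
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Hypothetical stand-in for `actions`: builds a Map and counts each run.
buildTable :: IORef Int -> IO (M.Map Int Int)
buildTable counter = do
  modifyIORef' counter (+ 1)
  return $ M.fromList [(k, k * 2) | k <- [1 .. 1000]]

-- Runs the builder on every call, like a handler that binds `actions` itself.
lookupRebuilding :: IORef Int -> Int -> IO (Maybe Int)
lookupRebuilding counter k = do
  table <- buildTable counter
  return (M.lookup k table)

-- Takes an already-built table, so all lookups share one copy.
lookupShared :: M.Map Int Int -> Int -> Maybe Int
lookupShared table k = M.lookup k table

-- Returns (builds when rebuilding per call, builds when sharing one table).
demo :: IO (Int, Int)
demo = do
  c1 <- newIORef 0
  mapM_ (lookupRebuilding c1) [1, 2, 3]
  rebuilt <- readIORef c1
  c2 <- newIORef 0
  table <- buildTable c2
  let _results = map (lookupShared table) [1, 2, 3]
  shared <- readIORef c2
  return (rebuilt, shared)
```

Here demo yields (3, 1): three builds when each lookup runs the builder, one build when the table is bound once and reused.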


My Question:

Is there an easy way to hold the Map and Set in memory so that every subsequent lookup is fast? One idea I came up with was to run a thread that sleeps and maintains storage for the Map and Set. However, this seems like overkill to me. What am I overlooking? Thank you for bearing with my long-winded explanation.

Note: I'm not necessarily looking for code answers, more so suggestions, advice, etc.

eazar001 asked Apr 01 '26 06:04



2 Answers

You can evaluate obj and actions only once, during snaplet initialization, and store the results in the snaplet's state.

data SnapApp = SnapApp
    { objectSet :: ObjectSet
    , actionMap :: ActionMap
    }

appInit :: SnapletInit SnapApp SnapApp
appInit = makeSnaplet ... $ do
    ... 
    a <- liftIO actions
    o <- liftIO obj
    return $ SnapApp o a

Now you can access them from a Snap Handler:

someUrlHandler :: Handler SnapApp SnapApp ()
someUrlHandler = do
  a <- gets actionMap
  o <- gets objectSet
  res <- query a o
  ...

This guarantees that actions and obj are run only once, at initialization, rather than on every request.

max taldykin answered Apr 04 '26 10:04



Here is what I was thinking of doing with IORef:

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)
import Control.Monad (replicateM_)

-- Global cache cell. NOINLINE stops GHC from inlining the definition,
-- which could otherwise create more than one IORef.
{-# NOINLINE val_ #-}
val_ :: IORef (Maybe Integer)
val_ = unsafePerformIO $ newIORef Nothing

val :: IO Integer
val = do
  cached <- readIORef val_
  case cached of
    Just v  -> return v
    Nothing -> do
      contents <- readFile "large.txt"
      -- replace this part with your actual computation
      let l = sum $ map (fromIntegral . fromEnum) contents
      writeIORef val_ $ Just l
      return l

main :: IO ()
main = do
  writeFile "large.txt" (replicate (10^7) '0')
  putStrLn "reading"
  replicateM_ 10 (val >>= print)

This ensures that the time-consuming operation is only ever evaluated once. The first time you execute val, it writes the value to the IORef; every subsequent time, it retrieves the value from there. When I ran main, it took a few seconds to print the number the first time and no noticeable time afterwards.

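You need unsafePerformIO because a top-level IORef x is a single cell that lives for the whole program and is never garbage collected, whereas a top-level IO (IORef x) is an action that creates a fresh cell each time it runs, so anything cached in it would be lost.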
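The difference between the two types can be seen in a small sketch. The names freshCell, sharedCell, and demoCells are hypothetical and used only for illustration; the NOINLINE pragma is the usual precaution for a top-level unsafePerformIO, so that GHC does not duplicate the cell by inlining it.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- An action: every run allocates a new, independent IORef.
freshCell :: IO (IORef Int)
freshCell = newIORef 0

-- A single top-level cell shared by every use site.
{-# NOINLINE sharedCell #-}
sharedCell :: IORef Int
sharedCell = unsafePerformIO (newIORef 0)

-- Returns (value seen in a second fresh cell, value seen in the shared cell).
demoCells :: IO (Int, Int)
demoCells = do
  a <- freshCell
  writeIORef a 42
  b <- freshCell              -- a different cell; the write to `a` is not visible
  fromFresh <- readIORef b
  writeIORef sharedCell 42
  fromShared <- readIORef sharedCell
  return (fromFresh, fromShared)
```

demoCells yields (0, 42): the second run of freshCell does not see the earlier write, while sharedCell does.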

Keep in mind that writing to the IORef does not evaluate anything; the value is evaluated the first time it is used, even if you call val earlier.
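If the cost should be paid when the cache is filled rather than at first use, one option is to force the value before storing it. This is a minimal sketch, assuming Control.Exception.evaluate is acceptable here; expensive, cacheForced, and demoForced are hypothetical names. Note that evaluate only forces to weak head normal form, which is the whole value for an Integer but may leave parts of a nested lazy structure unevaluated.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for the expensive computation.
expensive :: Integer
expensive = sum [1 .. 1000000]

-- Fills the cache with an already-evaluated value on the first call.
cacheForced :: IORef (Maybe Integer) -> IO Integer
cacheForced ref = do
  cached <- readIORef ref
  case cached of
    Just v  -> return v
    Nothing -> do
      v <- evaluate expensive   -- forces the value now, not at first use
      writeIORef ref (Just v)
      return v

-- Calls the cache twice; the second call reads the stored value.
demoForced :: IO (Integer, Integer)
demoForced = do
  ref <- newIORef Nothing
  first <- cacheForced ref
  second <- cacheForced ref
  return (first, second)
```

For a large Map or Set, the same idea applies, though a deeper force (for example via the deepseq package) may be needed depending on how lazy the stored values are.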

The simpler solution is probably to use monad transformers. You didn't provide an example of where in your snap program this table will be used, so I can't really give a satisfactory example.
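As a rough idea of what the transformer approach could look like, here is a sketch using ReaderT from the transformers package. Everything in it is an assumption for illustration: Env, App, describe, and runDemo are hypothetical, and the String-based Map and Set merely stand in for the real ActionMap and ObjectSet. The tables are built once, placed in an environment, and read by any code running in App.

```haskell
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical environment holding the prebuilt tables.
data Env = Env
  { envActions :: M.Map String String
  , envObjects :: S.Set String
  }

type App = ReaderT Env IO

-- A handler-like function that reads the shared tables from the environment.
describe :: String -> App String
describe noun = do
  known  <- asks (S.member noun . envObjects)
  action <- asks (M.lookup noun . envActions)
  return $ case (known, action) of
    (True, Just a)  -> noun ++ ": " ++ a
    (True, Nothing) -> noun ++ ": no action"
    (False, _)      -> noun ++ ": unknown"

-- Builds the environment once and runs several lookups against it.
runDemo :: IO [String]
runDemo = do
  let env = Env (M.fromList [("apple", "sell")]) (S.fromList ["apple", "pear"])
  runReaderT (mapM describe ["apple", "pear", "rock"]) env
```

runDemo yields ["apple: sell", "pear: no action", "rock: unknown"]. The snaplet state in the other answer plays much the same role as Env here.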

user2407038 answered Apr 04 '26 11:04



