Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"inject" progress logging/tracing in haskell computation?

I'm picking a specific task to illustrate what I was talking about

Let's say I wanted to find the sum of all the factors of a large number, naively -- by checking every number below it if it was a factor, then adding them together.

In an imperative programming language with no separation between IO and pure computations, you might do something like this

def sum_of_factors(n):
  sum = 0
  for i between 1 and n:
    if (n % i == 0):
      sum += i
  return sum

However if my n is large, I'd end up staring at an empty screen for a long time before the computation finishes. So I add some logging --

def sum_of_factors(n):
  sum = 0
  for i between 1 and n:
    if (i % 1000 == 0):
      print "checking $i..."
    if (n % i == 0):
      print "found factor $i"
      sum += 1
  return sum

and really, this addition is trivial.

Now, if I were to do this in textbook haskell i might do

sum_of_factors :: Int -> Int
sum_of_factors n = foldl' (+) 0 factors
  where
    factors = filter ((== 0) . (mod n)) [1..n]

I run into the same problem as before...for large numbers, I just stare at a blank screen for a while.

But I can't figure out how to inject the same kind of tracing/logging in the Haskell code. i'm not sure, other than maybe re-implementing fold with explicit recursion, to get the same tracing pattern/result as in the imperative impure code.

Is there a faculty in Haskell to make this doable? One that doesn't require refactoring everything?

Thank you

like image 960
Justin L. Avatar asked Jun 25 '13 06:06

Justin L.


2 Answers

There is a number of possible solutions.

The simplest one is to alter your function to return stream of events instead of the final result. You sum_of_factors doesn't compile for me, so I'll use a sum function as an example. The idea is to send Left message to show progress, and send Right result when done. Thanks to lazy evaluation, you'll see progress events while the function is working:

import Control.Monad

sum' :: [Int] -> [Either String Int]
sum' = go step 0
  where
  step = 10000
  go _ res [] = [Right res]
  go 0 res (x:xs) = Left ("progress: " ++ show x) : go step (res + x) xs
  go c res (x:xs) = go (c - 1) (res + x) xs

main :: IO ()
main = do
  forM_ (sum' [1..1000000]) $ \event ->
    case event of
      Right res -> putStrLn $ "Result: " ++ show res
      Left str -> putStrLn str

Other (and better from my point of view) solution is to make the function monadic:

class Monad m => LogM m where
  logMe :: String -> m ()

instance LogM IO where
  logMe = putStrLn

sum' :: LogM m => [Int] -> m Int
sum' = go step 0
  where
  step = 10000
  go _ res [] = return res
  go 0 res (x:xs) = logMe ("progress: " ++ show x) >> go step (res + x) xs
  go c res (x:xs) = go (c - 1) (res + x) xs

main :: IO ()
main = sum' [1..1000000] >>= print

or using foldM:

import Control.Monad

sum' :: LogM m => [Int] -> m Int
sum' = liftM snd . foldM go (0, 0)
  where
    step = 10000
    -- `!` forces evaluation and prevents build-up of thunks.
    -- See the BangPatterns language extension.
    go (!c, !res) x = do
        when (c == 0) $ logMe ("progress: " ++ show x)
        return $ ((c + 1) `mod` step, res + x)
like image 130
Yuras Avatar answered Nov 19 '22 00:11

Yuras


If you need quick and dirty logging, you can use Debug.Trace. It allows you to quickly add logging functions into pure code. (Of course, under the hood it uses unsafe stuff for that.) Be prepared that its logging output appears at different times than you expect (or not at all) - this is the consequence of adding impure debugging code into pure computations that are lazily evaluated.

Otherwise, you have to use monadic code, in order to properly sequence logging output. One of well developed libraries that uses IO is hslogger.

If you don't want to bind your code to IO (which is very sensible), Yuras' approach is the way to go. Create your own monad type class that describes your logging operations (perhaps with different levels etc.). Then, have one instance that produces logging output, like in the answer, and one instance that doesn't do anything, like

instance LogM Identity where
  logMe _ = return ()

Then, just by switching the monad you're working with, you turn loggin on/off, and the compiler optimizes the Identity monad away.

like image 23
Petr Avatar answered Nov 18 '22 22:11

Petr