This is a question about how to do something in Haskell, that is easy to do in Python3.
I have a Python3 program that uses generators, as follows:
def gen(filename):
for line in open(filename):
line = line.rstrip()
print(f"From {filename} about to yield the line {line}")
yield line
print(f"From {filename} recently yielded the line {line}")
if "." in line:
yield from gen(line)
for line in gen("a.txt"):
print(f"Main program has {line}")
If I give it an input file a.txt containing the following:
First line of a
b.txt
Third line of a
and another input file b.txt containing the following:
First line of b
Second line of b
then the output, as I would expect, is:
From a.txt about to yield the line First line of a
Main program has First line of a
From a.txt recently yielded the line First line of a
From a.txt about to yield the line b.txt
Main program has b.txt
From a.txt recently yielded the line b.txt
From b.txt about to yield the line First line of b
Main program has First line of b
From b.txt recently yielded the line First line of b
From b.txt about to yield the line Second line of b
Main program has Second line of b
From b.txt recently yielded the line Second line of b
From a.txt about to yield the line Third line of a
Main program has Third line of a
From a.txt recently yielded the line Third line of a
I would very much like to be able to do the same thing in Haskell. I need to preserve the basic structure of the Python version, keeping gen() as a function (or semi-coroutine, if you want to call it that) that I can call from various different places.
In my attempts to write it in Haskell, I am getting in a dreadful mess with types like:
IO [IO String]
and I suspect that I'm going about it the wrong way.
Any suggestions?
The data type you're looking for is FreeT
:
data FreeF f a b = Pure a | Free (f b)
newtype FreeT f m a = FreeT { runFreeT :: m (FreeF f a (FreeT f m a)) }
FreeT f m a
represents "alternating layers of m
followed by f
, except at any point, instead of a f
-layer, there can be a terminating a
-value". The specific form of this type that equates to your generators is
type Generator a = FreeT ((,) a) IO ()
A Generator a
is alternating layers of IO
-computation and a
-production, except one of the IO
-computations can terminate generation by producing a ()
instead.
instance (Functor f, Monad m) => Monad (FreeT f m)
instance MonadTrans (FreeT f)
lift :: (MonadTrans t, Monad m) => m a -> t m a
lift :: Monad m => m a -> FreeT f m a
liftF :: (Functor f, MonadFree f m) => f a -> m a
liftF :: (Functor f, Monad m) => f a -> FreeT f m a
So you can write gen
in this manner:
-- singleton generator
yield :: a -> Generator a
yield x = liftF (x, ())
-- empty generator (unused/part of when)
end :: Generator a
end = return ()
gen :: FilePath -> Generator String
gen path = do handle <- lift $ openFile path
fileLines <- lift $ lines <$> hGetContents handle
forM_ fileLines $ \line -> do
lift $ putStrLn $ "From " ++ path ++ " about to yield the line " ++ line
yield line
lift $ putStrLn $ "From " ++ path ++ " recently yielded the line " ++ line
when ('.' `elem` line) $ gen line
lift $ hClose handle
You can then tear this structure back down:
main = iterT (\(line, continue) -> putStrLn ("Main program has " ++ line) >> continue)
(gen "a.txt")
FreeT
has a cousin, named Stream
. This is pretty much the same thing, except it uses some tricks to squeeze some more performance out. (The cost is that equality is now fuzzier than before, but we don't usually care about equality of control structures anyway.) Specifically, instead of alternating m
and f
layers, Stream
is just a sequence of m
and f
layers in whatever order they're created in. As long as Monad m
, this is fine, as adjacent m
layers can always be join
ed together and new pure
ones formed between adjacent f
-layers, until the alternating structure of FreeT
is recovered.
If you use Stream
(which you probably should), then you may want to replace (,)
with Of
, from the same package:
data Of a b = !a :> b
type Generator a = Stream (Of a) IO ()
The strictness isn't particularly useful with String
, but it does read better. It's more useful for other types, like Int
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With