Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

hGetContents being too lazy

I have the following snippet of code, which I pass to withFile:

text <- hGetContents hand 
let code = parseCode text
return code

Here hand is a valid file handle, opened with ReadMode and parseCode is my own function that reads the input and returns a Maybe. As it is, the function fails and returns Nothing. If, instead I write:

text <- hGetContents hand 
putStrLn text
let code = parseCode text
return code

I get a Just, as I should.

If I do openFile and hClose myself, I have the same problem. Why is this happening? How can I cleanly solve it?

Thanks

like image 397
Karolis Juodelė Avatar asked May 07 '12 16:05

Karolis Juodelė


1 Answers

hGetContents isn't too lazy, it just needs to be composed with other things appropriately to get the desired effect. Maybe the situation would be clearer if it were were renamed exposeContentsToEvaluationAsNeededForTheRestOfTheAction or just listen.

withFile opens the file, does something (or nothing, as you please -- exactly what you require of it in any case), and closes the file.

It will hardly suffice to bring out all the mysteries of 'lazy IO', but consider now this difference in bracketing

 good file operation = withFile file ReadMode (hGetContents >=> operation >=> print)
 bad file operation = (withFile file ReadMode hGetContents) >>= operation >>= print

-- *Main> good "lazyio.hs" (return . length)
-- 503
-- *Main> bad "lazyio.hs" (return . length)
-- 0

Crudely put, bad opens and closes the file before it does anything; good does everything in between opening and closing the file. Your first action was akin to bad. withFile should govern all of the action you want done that that depends on the handle.

You don't need a strictness enforcer if you are working with String, small files, etc., just an idea how the composition works. Again, in bad all I 'do' before closing the file is exposeContentsToEvaluationAsNeededForTheRestOfTheAction. In good I compose exposeContentsToEvaluationAsNeededForTheRestOfTheAction with the rest of the action I have in mind, then close the file.

The familiar length + seq trick mentioned by Patrick, or length + evaluate is worth knowing; your second action with putStrLn txt was a variant. But reorganization is better, unless lazy IO is wrong for your case.

$ time ./bad
bad: Prelude.last: empty list  
                        -- no, lots of Chars there
real    0m0.087s

$ time ./good
'\n'                -- right
()
real    0m15.977s

$ time ./seqing 
Killed               -- hopeless, attempting to represent the file contents
    real    1m54.065s    -- in memory as a linked list, before finding out the last char

It goes without saying that ByteString and Text are worth knowing about, but reorganization with evaluation in mind is better, since even with them the Lazy variants are often what you need, and they then involve grasping the same distinctions between forms of composition. If you are dealing with one of the (immense) class of cases where this sort of IO is inappropriate, take a look at enumerator, conduit and co., all wonderful.

like image 63
applicative Avatar answered Oct 17 '22 20:10

applicative