
Haskell IO and closing files

When I open a file for reading in Haskell, I've found that I can't use the contents of the file after closing it. For example, this program will print the contents of a file:

    import System.IO

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              putStr contents
              hClose inFile

I expected that interchanging the putStr line with the hClose line would have no effect, but this program prints nothing:

    import System.IO

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              hClose inFile
              putStr contents

Why does this happen? I'm guessing it has something to do with lazy evaluation, but I thought these expressions would get sequenced so there wouldn't be a problem. How would you implement a function like readFile?

Jay Conrod asked Nov 17 '08 20:11



1 Answer

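As others have stated, it is because of lazy evaluation. hGetContents returns immediately without reading anything and leaves the handle semi-closed; the data is read only as the string is consumed, and the handle is closed automatically once all of it has been read. If you call hClose before the string has been consumed, the stream is cut off at that point, which is why your second program prints nothing. Both hGetContents and readFile are lazy in this way. In cases where you're having issues with handles being kept open, typically you just force the read. Here's the easy way: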

    import System.IO
    import Control.Parallel.Strategies (rnf)
    -- rnf means "reduce to normal form"; newer library versions export it from Control.DeepSeq

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              rnf contents `seq` hClose inFile -- force the whole file to be read, then close
              putStr contents
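To address the "how would you implement a function like readFile" part of the question, here is a minimal sketch of a strict variant that needs no extra packages. The name readFileStrict is hypothetical (it is not a library function), and this is only one way to do it: withFile closes the handle when the action finishes, and evaluate (length contents) forces the whole list before that happens.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper (not part of any library): read a whole file strictly,
-- so that the handle can be closed safely before the result is used.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    -- length walks the entire list, which forces the whole file to be read
    -- while the handle is still open
    _ <- evaluate (length contents)
    return contents
```

With this helper, something like readFileStrict "foo" >>= putStr should print the file even though the handle has already been closed by the time putStr runs.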

These days, however, String is rarely the preferred type for file I/O. The usual approach is to use Data.ByteString (from the bytestring package on Hackage), and Data.ByteString.Lazy when you want lazy reads.

    import qualified Data.ByteString as Str

    main = do contents <- Str.readFile "foo"
              -- readFile is strict, so the entire string is read here
              Str.putStr contents

ByteStrings are the way to go for big strings (like file contents). They are much faster and more memory efficient than String (= [Char]).
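For the lazy variant mentioned above, here is a small sketch using Data.ByteString.Lazy. The helper name countLines is made up for illustration. The lazy readFile reads the file in chunks as they are demanded, so a fold such as count should not need the whole file in memory at once; the result is forced with $! before returning so that the file is fully consumed inside the function.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- Hypothetical helper: count the newline bytes in a file.
-- L.readFile is lazy, so chunks are read only as count consumes them.
countLines :: FilePath -> IO Int64
countLines path = do
  contents <- L.readFile path
  -- 10 is the byte value of '\n'; $! forces the count before returning
  return $! L.count 10 contents
```

Note that this counts bytes, not characters, which is the usual trade-off when working with ByteString instead of String.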

Notes:

I imported rnf from Control.Parallel.Strategies only for convenience. You could write something like it yourself pretty easily:

    forceList :: [a] -> ()
    forceList [] = ()
    forceList (_:xs) = forceList xs

This just forces a traversal of the spine (not the values) of the list, which would have the effect of reading the whole file.

Lazy I/O is increasingly considered evil by experts; I recommend using strict bytestrings for most file I/O for the time being. There are a few solutions in the oven which attempt to bring back composable incremental reads, the most promising of which is called "Iteratee" by Oleg.

luqui answered Oct 05 '22 08:10
