
In Haskell, I want to read a file and then write to it. Do I need strictness annotation?

Still quite new to Haskell..

I want to read the contents of a file, do something with it possibly involving IO (using putStrLn for now) and then write new contents to the same file.

I came up with:

doit :: String -> IO ()
doit file = do
    contents <- withFile file ReadMode $ \h -> hGetContents h
    putStrLn contents
    withFile file WriteMode $ \h -> hPutStrLn h "new content"

However this doesn't work due to laziness. The file contents are not printed. I found this post which explains it well.

The solution proposed there is to include putStrLn within the withFile:

doit :: String -> IO ()
doit file = do
    withFile file ReadMode $ \h -> do
        contents <- hGetContents h
        putStrLn contents
    withFile file WriteMode $ \h -> hPutStrLn h "new content"

This works, but it's not what I want to do. The operation that will eventually replace putStrLn might take a long time, and I don't want to keep the file open the whole time. In general, I just want to get the file contents out and close the file before working with those contents.

The solution I came up with is the following:

doit :: String -> IO ()
doit file = do
    c <- newIORef ""
    withFile file ReadMode $ \h -> do
        a <- hGetContents h
        writeIORef c $! a
    d <- readIORef c
    putStrLn d
    withFile file WriteMode $ \h -> hPutStrLn h "Test"

However, I find this long and a bit obfuscated. I don't think I should need an IORef just to get a value out, but I needed a "place" to put the file contents. Also, it still didn't work without the strictness annotation $! for writeIORef. I guess IORefs are not strict by nature?

Can anyone recommend a better, shorter way to do this while keeping my desired semantics?

Thanks!

Steve asked Mar 26 '10 22:03


3 Answers

The reason your first program does not work is that withFile closes the file after executing the IO action passed to it. In your case, the IO action is hGetContents which does not read the file right away, but only as its contents are demanded. By the time you try to print the file's contents, withFile has already closed the file, so the read fails (silently).

You can fix this issue by not reinventing the wheel and simply using readFile and writeFile:

doit file = do
    contents <- readFile file
    putStrLn contents
    writeFile file "new content"

But suppose you want the new content to depend on the old content. Then you cannot, generally, simply do

doit file = do
    contents <- readFile file
    writeFile file $ process contents

because the writeFile may affect what the readFile returns (remember, it has not actually read the file yet). Or, depending on your operating system, you might not be able to open the same file for reading and writing on two separate handles. The simple but ugly workaround is

doit file = do
    contents <- readFile file
    length contents `seq` (writeFile file $ process contents)

which will force readFile to read the entire file and close it before the writeFile action can begin.
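
The same forcing trick can also be used inside withFile, which gives the "read, close, then work" behaviour asked for in the question. The following is a minimal sketch, not a definitive implementation: readFileStrict is a hypothetical helper name (not a library function), and example.txt is a placeholder file name.

```haskell
import Control.Exception (evaluate)
import System.IO

-- | Read a whole file and close the handle before returning.
-- 'readFileStrict' is a hypothetical helper, not part of base.
readFileStrict :: FilePath -> IO String
readFileStrict path =
    withFile path ReadMode $ \h -> do
        contents <- hGetContents h
        -- Forcing the length walks the entire lazy string, so every
        -- character is read while the handle is still open.
        _ <- evaluate (length contents)
        return contents

doit :: FilePath -> IO ()
doit file = do
    contents <- readFileStrict file  -- the file is closed after this line
    putStrLn contents                -- long-running work can go here
    writeFile file "new content"

main :: IO ()
main = do
    writeFile "example.txt" "old content"  -- placeholder file name
    doit "example.txt"
```

Using evaluate rather than seq ties the forcing to a specific point in the IO sequence, which makes the ordering easier to reason about.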

Reid Barton answered Oct 12 '22 23:10


I think the easiest way to solve this problem is to use strict IO:

import qualified System.IO.Strict as S
main = do
    file <- S.readFile "filename"
    writeFile "filename" file
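
Note that System.IO.Strict is not part of base; it is provided by the separate strict package. As an alternative sketch, assuming the text package is available (it ships with GHC), Data.Text.IO.readFile also reads the whole file strictly. Here rewriteUpper and example.txt are hypothetical names, and T.toUpper merely stands in for real processing.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Rewrite a file in place; 'T.toUpper' stands in for real processing.
-- 'rewriteUpper' is a hypothetical name used only for illustration.
rewriteUpper :: FilePath -> IO ()
rewriteUpper path = do
    -- Strict read: the whole file is in memory and the handle is
    -- closed before the next line runs.
    contents <- TIO.readFile path
    TIO.writeFile path (T.toUpper contents)

main :: IO ()
main = do
    TIO.writeFile "example.txt" (T.pack "some text\n")  -- placeholder file name
    rewriteUpper "example.txt"
```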

Spinno answered Oct 13 '22 01:10


You can duplicate the file Handle, write lazily with the original handle (at the end of the file), and read lazily with the duplicate. That way no strictness annotation is needed when you are appending to the file.

import System.IO
import GHC.IO.Handle

main :: IO ()
main = do
    h <- openFile "filename" ReadWriteMode
    h2 <- hDuplicate h

    hSeek h2 AbsoluteSeek 0
    originalFileContents <- hGetContents h2
    putStrLn originalFileContents

    hSeek h SeekFromEnd 0
    hPutStrLn h $ concatMap ("{new_contents}" ++) (lines originalFileContents)

    hClose h2
    hClose h

The hDuplicate function is provided by the GHC.IO.Handle module. Its documentation describes it as follows:

Returns a duplicate of the original handle, with its own buffer. The two Handles will share a file pointer, however. The original handle's buffer is flushed, including discarding any input data, before the handle is duplicated.

With hSeek you can set the position of the handle before reading or writing.

However, I'm not sure how reliable it would be to use "AbsoluteSeek 0" instead of "SeekFromEnd 0" for writing, i.e. for overwriting the contents. In general I would suggest writing to a temporary file first, for example using openTempFile (from System.IO), and then replacing the original.
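
The temporary-file approach could look roughly like this. It is only a sketch under stated assumptions: replaceVia, its process argument, the replace.tmp template and example.txt are all hypothetical names, and renameFile comes from System.Directory (the directory package), not from System.IO.

```haskell
import Control.Exception (evaluate)
import System.Directory (renameFile)
import System.FilePath (takeDirectory)
import System.IO

-- | Replace a file's contents by writing a temporary file and renaming it.
-- 'replaceVia' and 'process' are hypothetical names for illustration.
replaceVia :: (String -> String) -> FilePath -> IO ()
replaceVia process path = do
    -- Read strictly so the original file is closed before it is replaced.
    old <- readFile path
    _ <- evaluate (length old)
    -- Create the temporary file next to the original so that the
    -- rename does not have to cross filesystems.
    (tmpPath, tmpHandle) <- openTempFile (takeDirectory path) "replace.tmp"
    hPutStr tmpHandle (process old)
    hClose tmpHandle
    -- Move the finished temporary file over the original.
    renameFile tmpPath path

main :: IO ()
main = do
    writeFile "example.txt" "line one\n"  -- placeholder file name
    replaceVia (++ "line two\n") "example.txt"
```

If the processing step fails, the original file is left untouched, because nothing is written to it until the final rename. Note that the renamed file may end up with different permissions than the original.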

AleXoundOS answered Oct 12 '22 23:10