Say I'm going to open a file and parse its contents, and I want to do that lazily:
parseFile :: FilePath -> IO [SomeData]
parseFile path = openBinaryFile path ReadMode >>= parse' where
  parse' handle = hIsEOF handle >>= \eof -> do
    if eof then hClose handle >> return []
           else do
             first <- parseFirst handle
             rest <- unsafeInterleaveIO $ parse' handle
             return (first : rest)
The above code is fine if no error occurs during the whole reading process. But if an exception is thrown, hClose never gets a chance to run, and the handle won't be closed properly.
Usually, if the IO process isn't lazy, exception handling can easily be done with catch or bracket. However, in this case the normal exception handling methods would close the file handle before the actual reading starts. That is of course not acceptable.
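To make the problem concrete, here is a minimal sketch that uses lazy hGetContents as a stand-in for the lazy parser above. The names leakLazyContents and forceLeaked are hypothetical. The lazy result escapes the bracketed scope, so by the time it is forced the handle is already closed; on recent versions of base this typically surfaces as an IOException, while older versions may silently return an empty result instead.

```haskell
import Control.Exception (SomeException, evaluate, try)
import System.IO

-- Hypothetical helper: returns the lazy contents out of the bracketed scope.
-- Nothing has been read yet when withFile closes the handle.
leakLazyContents :: FilePath -> IO String
leakLazyContents path = withFile path ReadMode hGetContents

-- Hypothetical helper: forces the leaked string after the handle is closed
-- and reports what happened instead of crashing.
forceLeaked :: FilePath -> IO (Either SomeException Int)
forceLeaked path = do
  s <- leakLazyContents path
  try (evaluate (length s))
```

Whichever of the two outcomes you get, the file's actual contents are never seen.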
So what is the common way to release a resource that has to outlive its scope because of laziness, as in my case, while still ensuring exception safety?
Instead of using openBinaryFile, you could use withBinaryFile:
parseFile :: FilePath -> ([SomeData] -> IO a) -> IO a
parseFile path f = withBinaryFile path ReadMode $ \h -> do
    values <- parse' h
    f values
  where
    parse' = ... -- same as now
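For reference, a self-contained sketch of this continuation-passing version might look as follows. It makes several assumptions that are not in the question: SomeData is taken to be a single line of text, parseFirst is just hGetLine, and the lazy reader is renamed parseLazily with its hClose dropped, since withBinaryFile now owns the handle. The important caveat is that the callback must force everything it needs before it returns, because the handle is closed as soon as it does.

```haskell
import Control.Exception (evaluate)
import System.IO
import System.IO.Unsafe (unsafeInterleaveIO)

-- Assumption: one line of text stands in for the real record type.
type SomeData = String

-- Assumption: a trivial parser that reads one record (one line).
parseFirst :: Handle -> IO SomeData
parseFirst = hGetLine

-- Lazy reader in the style of the question, without the hClose.
parseLazily :: Handle -> IO [SomeData]
parseLazily h = do
  eof <- hIsEOF h
  if eof
    then return []
    else do
      first <- parseFirst h
      rest  <- unsafeInterleaveIO (parseLazily h)
      return (first : rest)

-- The callback sees the lazy list only while the handle is open.
parseFile :: FilePath -> ([SomeData] -> IO a) -> IO a
parseFile path f = withBinaryFile path ReadMode $ \h -> parseLazily h >>= f

-- Example consumer: evaluate forces the count inside the scope,
-- so every read happens before withBinaryFile closes the handle.
countRecords :: FilePath -> IO Int
countRecords path = parseFile path (evaluate . length)
```

A callback such as `return` would hand the unevaluated list back to the caller and reintroduce the original problem, so consumers should reduce the list to a fully evaluated result.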
However, I'd strongly recommend you consider using a streaming data library instead, as they are designed to work with this kind of situation and handle exceptions properly. For example, with conduit, your code would look something like:
parseFile :: MonadResource m => FilePath -> Producer m SomeData
parseFile path = bracketP
    (openBinaryFile path ReadMode)
    hClose
    loop
  where
    loop handle = do
        -- hIsEOF and parseFirst run in IO, so lift them into the conduit monad
        eof <- liftIO $ hIsEOF handle
        if eof
            then return ()
            else liftIO (parseFirst handle) >>= yield >> loop handle
And if you instead rewrite your parseFirst function to use conduit itself and not drop down to the Handle API, this glue code would be shorter, and you wouldn't be tied directly to Handle, which makes it easier to use other data sources and perform testing.
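As a rough illustration of what a Handle-free version could look like, here is a sketch that assumes conduit 1.3 or later (the Conduit module and the ConduitT type, rather than the older Producer synonym used above). It also assumes, purely for illustration, that a record is one newline-terminated line of bytes; parseBytes and firstRecords are hypothetical names.

```haskell
import Conduit
import qualified Data.ByteString as B

-- Assumption: one raw line of bytes stands in for the real record type.
type SomeData = B.ByteString

-- Streams records from a file. sourceFile opens and closes the file itself.
parseFile :: MonadResource m => FilePath -> ConduitT i SomeData m ()
parseFile path = sourceFile path .| linesUnboundedAsciiC

-- The same splitting logic over an in-memory value, handy for tests.
parseBytes :: Monad m => B.ByteString -> ConduitT i SomeData m ()
parseBytes bytes = yield bytes .| linesUnboundedAsciiC

-- Consumes at most n records; the file is released when runConduitRes exits,
-- even though the rest of the input is never read.
firstRecords :: Int -> FilePath -> IO [SomeData]
firstRecords n path = runConduitRes $ parseFile path .| takeC n .| sinkList
```

Because the record-splitting stage only sees a stream of bytes, the same code can be exercised without touching the file system, for example with `runConduitPure (parseBytes bytes .| sinkList)`.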
The conduit tutorial is available on the School of Haskell.
UPDATE: One thing I forgot to mention is that, while the question focuses on exceptions preventing the file from being closed, the same thing happens in non-exceptional situations if you don't completely consume the input. For example, if your file has more than one record and you only force evaluation of the first one, the file will not be closed until the garbage collector is able to reclaim the handle. Yet another reason to use either withBinaryFile or a streaming data library.
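The scoped behaviour is easy to check with a small sketch. The helper names below are hypothetical, and a record is again assumed to be one line. The handle is returned from the scope only so that its state can be inspected afterwards; real code should not let it escape.

```haskell
import System.IO

-- Reads only the first record and hands the handle back for inspection.
firstRecordAndHandle :: FilePath -> IO (String, Handle)
firstRecordAndHandle path =
  withBinaryFile path ReadMode $ \h -> do
    first <- hGetLine h
    return (first, h)

-- Reports whether the handle is still open once the scope has exited.
stillOpenAfterScope :: FilePath -> IO Bool
stillOpenAfterScope path = do
  (_, h) <- firstRecordAndHandle path
  hIsOpen h
```

For a file with several records, stillOpenAfterScope returns False: withBinaryFile closes the handle on exit regardless of how much input was consumed, without waiting for the garbage collector.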