In essence I wish to know how to approach implementing tail -F
Linux command functionality in Haskell. My goal is to follow a log file, such as a web server log file, and compute various real time statistics by parsing the input as it comes in. Ideally with no interruptions if the log file is rotated with logrotate
or similar service.
I'm somewhat at loss on how to even approach the problem and what should I take into consideration in terms of performance in presence of lazy I/O. Would any of the streaming libraries be relevant here?
This is a partial answer, as it doesn't handle file truncation by logrotate
. It avoids lazy I/O and uses the bytestring, streaming, streaming-bytestring and hinotify packages.
Some preliminary imports:
{-# language OverloadedStrings #-}
module Main where
import qualified Data.ByteString
import Data.ByteString.Lazy.Internal (defaultChunkSize)
import qualified Data.ByteString.Streaming as B
import Streaming
import qualified Streaming.Prelude as S
import Control.Concurrent.QSem
import System.INotify
import System.IO (withFile,IOMode(ReadMode))
import System.Environment (getArgs)
Here's the "tailing" function:
tailing :: FilePath -> (B.ByteString IO () -> IO r) -> IO r
tailing filepath continuation = withINotify $ \i -> do
sem <- newQSem 1
addWatch i [Modify] filepath (\_ -> signalQSem sem)
withFile filepath ReadMode (\h -> continuation (handleToStream sem h))
where
handleToStream sem h = B.concat . Streaming.repeats $ do
lift (waitQSem sem)
readWithoutClosing h
-- Can't use B.fromHandle here because annoyingly it closes handle on EOF
-- instead of just returning, and this causes problems on new appends.
readWithoutClosing h = do
c <- lift (Data.ByteString.hGetSome h defaultChunkSize)
if Data.ByteString.null c
then return ()
else do B.chunk c
readWithoutClosing h
It takes a file path an a callback that consumes a streaming bytestring.
The idea is that, each time before reading from the handle until EOF, we decrement a semaphore, which is only increased by the callback that is invoked when the file is modified.
We can test the function like this:
main :: IO ()
main = do
filepath : _ <- getArgs
tailing filepath B.stdout
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With