How to speed up Haskell IO with buffering?

I read about IO buffering in "Real World Haskell" (ch. 7, p. 189) and tried to test how different buffer sizes affect performance.

import System.IO
import Data.Time.Clock
import Data.Char(toUpper)

main :: IO ()
main = do
  hInp <- openFile "bigFile.txt" ReadMode
  let bufferSize = truncate $ 2**10
  hSetBuffering hInp (BlockBuffering (Just bufferSize))
  bufferMode <- hGetBuffering hInp
  putStrLn $ "Current buffering mode: " ++ (show bufferMode)

  startTime <- getCurrentTime
  inp <- hGetContents hInp
  writeFile "processed.txt" (map toUpper inp)
  hClose hInp
  finishTime <- getCurrentTime
  print $ diffUTCTime finishTime startTime
  return ()

Then I created "bigFile.txt":

-rw-rw-r-- 1 user user 96M Jan 26 09:49 bigFile.txt

and ran my program against this file with different buffer sizes:

Current buffering mode: BlockBuffering (Just 32)
9.744967s   

Current buffering mode: BlockBuffering (Just 1024)
9.667924s                                      

Current buffering mode: BlockBuffering (Just 1048576)
9.494807s    

Current buffering mode: BlockBuffering (Just 1073741824)
9.792453s   

But the running time is almost the same in every case. Is this normal, or am I doing something wrong?

azaviruha asked Dec 20 '22 07:12


1 Answer

On a modern OS, the buffer size is likely to have little effect on reading a file linearly, for two reasons: 1) the kernel performs read-ahead, and 2) the file may already be in the page cache if you have read it recently.
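
One way to check this on your own machine is to time the same read at several buffer sizes within a single run. The following is only a minimal sketch: the helper name timeRead and the fallback file "sample.txt" are made up for illustration, and the sizes in the list are arbitrary. The first size appears twice so that a cold first pass can be compared with a pass that probably hits the page cache.

```haskell
import Control.Exception (evaluate)
import Control.Monad (forM_)
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)
import System.Environment (getArgs)
import System.IO

-- Hypothetical helper: read the whole file through a handle with the given
-- block-buffer size and return the elapsed wall-clock time.
timeRead :: FilePath -> Int -> IO NominalDiffTime
timeRead path size = do
  start <- getCurrentTime
  h <- openFile path ReadMode
  hSetBuffering h (BlockBuffering (Just size))
  contents <- hGetContents h
  _ <- evaluate (length contents)  -- force the lazy read to run to the end
  hClose h
  end <- getCurrentTime
  return (diffUTCTime end start)

main :: IO ()
main = do
  args <- getArgs
  -- Use the file named on the command line, or create a small sample file.
  path <- case args of
    (p:_) -> return p
    []    -> do
      writeFile "sample.txt" (concat (replicate 40000 "abcdefghijklmnopqrstuvwxyz\n"))
      return "sample.txt"
  -- 1024 is listed twice: the second pass is likely served from the page cache.
  forM_ [1024, 1024, 32, 1048576] $ \size -> do
    t <- timeRead path size
    putStrLn (show size ++ " bytes: " ++ show t)
```

Run it with the path of the large file as the first argument. The generated sample file is too small for meaningful timings and only keeps the sketch runnable on its own.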

Here is a program which measures the effect of buffering on writes. Typical results are:

$ ./mkbigfile 32      -- 12.864733s
$ ./mkbigfile 64      --  9.668272s
$ ./mkbigfile 128     --  6.993664s
$ ./mkbigfile 512     --  4.130989s
$ ./mkbigfile 1024    --  3.536652s
$ ./mkbigfile 16384   --  3.055403s
$ ./mkbigfile 1000000 --  3.004879s
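
The shape of these numbers is roughly what you would expect if the cost were dominated by how often the buffer has to be flushed. The sketch below is a back-of-the-envelope model only: it assumes one flush per full buffer, which GHC's runtime need not follow exactly, and the name estimateFlushes is made up for illustration. The byte count is taken from the program in this answer (26 letters plus the newline added by hPutStrLn, repeated 96000000 `div` 26 times).

```haskell
-- Rough model (an assumption, not GHC's documented behaviour):
-- one flush per full buffer, computed as a ceiling division.
estimateFlushes :: Int -> Int -> Int
estimateFlushes totalBytes bufSize = (totalBytes + bufSize - 1) `div` bufSize

main :: IO ()
main = do
  let lineBytes = 26 + 1             -- 26 letters plus the newline from hPutStrLn
      n = 96000000 `div` 26          -- number of lines the benchmark writes
      total = n * lineBytes          -- total bytes written
  mapM_
    (\s -> putStrLn (show s ++ " bytes: ~" ++ show (estimateFlushes total s) ++ " flushes"))
    [32, 64, 128, 512, 1024, 16384, 1000000]
```

Under this model the flush count drops from millions to thousands between 32 and 16384 bytes, which fits the flattening of the measured times beyond a few kilobytes.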

Source:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as BS
import Data.ByteString (ByteString)
import Control.Monad
import System.IO
import System.Environment
import Data.Time.Clock

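main :: IO ()
main = do
  -- The buffer size in bytes is the first command-line argument.
  (arg:_) <- getArgs
  let size = read arg
  let bs = "abcdefghijklmnopqrstuvwxyz"
      -- Number of lines to write; hPutStrLn also appends a newline to each.
      n = 96000000 `div` (length bs)
  h <- openFile "bigFile.txt" WriteMode
  hSetBuffering h (BlockBuffering (Just size))
  startTime <- getCurrentTime
  replicateM_ n $ hPutStrLn h bs
  -- hClose flushes whatever is left in the buffer, so it is timed as well.
  hClose h
  finishTime <- getCurrentTime
  print $ diffUTCTime finishTime startTime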
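
If you would rather compare several sizes in one run than invoke the program repeatedly, the same measurement can be wrapped in a function. This is a sketch under assumptions: timeWrite and "bufferTest.tmp" are hypothetical names, and the line count is kept small so the run finishes quickly, which also makes the timings noisier than the ones above.

```haskell
import Control.Monad (forM_, replicateM_)
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)
import System.IO

-- Hypothetical helper: write n lines through a handle with the given
-- block-buffer size and return the elapsed wall-clock time.
timeWrite :: FilePath -> Int -> Int -> IO NominalDiffTime
timeWrite path size n = do
  h <- openFile path WriteMode
  hSetBuffering h (BlockBuffering (Just size))
  start <- getCurrentTime
  replicateM_ n (hPutStrLn h "abcdefghijklmnopqrstuvwxyz")
  hClose h  -- flushes the remaining buffer, so it is part of the timing
  end <- getCurrentTime
  return (diffUTCTime end start)

main :: IO ()
main = do
  let path = "bufferTest.tmp"  -- scratch file, overwritten on every pass
      n = 200000               -- lines per pass; raise it for steadier numbers
  forM_ [32, 128, 1024, 16384] $ \size -> do
    t <- timeWrite path size n
    putStrLn (show size ++ " bytes: " ++ show t)
```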
ErikR answered Dec 25 '22 23:12