Why is [Char]-based input so much slower than the [Char]-based output in Haskell?

Question

It is a common knowledge that one does not use [Char] to read large amounts of data in Haskell. One uses ByteStrings to do the job. The usual explanation for this is that Chars are large and lists add their overhead.

However, this does not seem to cause any problems with the output.

For example the following program:

main = interact $ const $ unwords $ map show $ replicate 500000 38000000

takes just 131 ms to run on my computer, while the following one:

import Data.List

sum' :: [Int] -> Int
sum' = foldl' (+) 0

main = interact $ show . sum' . map read . words

takes 3.38 seconds if fed the output of the first program as an input!

What is the reason for such a disparity between the input and output performance using Strings?

Judah Jacobson · Accepted Answer

I don't think that this issue necessarily has to do with I/O. Rather, it demonstrates that the Read instance for Int is pretty inefficient.

First, consider the following program which just processes a lazy list. It takes 4.1s on my machine (compiled with -O2):

main = print $ sum' $ map read $ words
        $ unwords $ map show $ replicate 500000 38000000

Replacing the read function with length drops the time down to 0.48s:

main = print $ sum' $ map length $ words
        $ unwords $ map show $ replicate 500000 38000000

Furthermore, replacing the read function with a handwritten version results in a time of 0.52s:

main = print $ sum' $ map myread $ words
        $ unwords $ map show $ replicate 500000 38000000

myread :: String -> Int
myread = loop 0
  where
    loop n [] = n
    loop n (d:ds) = let d' = fromEnum d  - fromEnum '0' :: Int
                        n' = 10 * n + d'
                    in loop n' ds

My guess as to why read is so inefficient is that its implementation uses the Text.ParserCombinators.ReadP module, which may not be the fastest choice for the simple case of reading a single integer.

Why is [Char]-based input so much slower than the [Char]-based output in Haskell?

Tags:

performance

string

io

haskell

Rotsor

1 Answers

Judah Jacobson

Recent Activity

Donate For Us

Why is [Char]-based input so much slower than the [Char]-based output in Haskell?

Tags:

performance

string

io

haskell

Rotsor

1 Answers

Judah Jacobson

Related questions

Recent Activity

Donate For Us