
Inconsistent behavior of lines and words

Tags:

haskell

Here is a GHCi session:

Prelude> words " one two three"
["one","two","three"]
Prelude> lines "\none\ntwo\nthree"
["","one","two","three"]

Is there a reason for this inconsistency? And, if so, what is it?

Ingo asked Aug 19 '16 21:08



1 Answer

lines is an actual bijection: you can use it to split up any string at the '\n' characters and later reassemble the pieces perfectly with unlines. (Well, almost: let's disregard trailing newlines and Windows line endings.)
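To make the round trip concrete, here is a minimal sketch. The name roundTrips is made up for illustration, and the check only holds under the caveat above: the input is assumed to end with exactly one '\n' and to contain no '\r'.

```haskell
-- Hypothetical helper: does unlines undo lines for this input?
roundTrips :: String -> Bool
roundTrips s = unlines (lines s) == s

main :: IO ()
main = do
  -- The leading empty line is kept, so nothing is lost at the front.
  print (lines "\none\ntwo\nthree\n")       -- ["","one","two","three"]
  -- Input ends with '\n': the round trip is exact.
  print (roundTrips "\none\ntwo\nthree\n")  -- True
  -- No trailing '\n': unlines appends one, so the result differs.
  print (roundTrips "\none\ntwo\nthree")    -- False
```

The empty string at the head of the result is exactly what lets unlines put the leading newline back.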

If words had the same behaviour, just with ' ' instead of '\n' as the separator character, it wouldn't quite work the way we want: for instance, the string

     "I will not buy this record\nit is scratched"

would get split into

     ["I","will","not","buy","this","record\nit","is","scratched"]

which words avoids, by splitting at any whitespace.

Prelude> words "I will not buy this record\nit is scratched"
["I","will","not","buy","this","record","it","is","scratched"]

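This means that a) it's not a bijection anyway, because the flavour of whitespace is lost, and b) splitting at every single whitespace character would give you a lot of “empty words” whenever two whitespace characters are adjacent (or whenever the string starts with whitespace, as in the question).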
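The “empty words” are easy to see with a naive splitter. The function splitAtEvery below is a hypothetical helper, not part of the Prelude; it cuts at every occurrence of one separator character and keeps empty pieces, roughly the way lines treats '\n' (unlike lines, it also keeps a trailing empty piece).

```haskell
-- Hypothetical helper (not in the Prelude): split at every occurrence of
-- the separator character and keep empty pieces.
splitAtEvery :: Char -> String -> [String]
splitAtEvery sep s = case break (== sep) s of
  (piece, [])       -> [piece]
  (piece, _ : rest) -> piece : splitAtEvery sep rest

main :: IO ()
main = do
  -- One leading space and a double space each produce an empty "word".
  print (splitAtEvery ' ' " one  two")  -- ["","one","","two"]
  -- The real words drops them.
  print (words " one  two")             -- ["one","two"]
```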

Hence, the sensible behaviour for words is to just condense such whitespace into a single gap.
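As a sketch of what “condensing” means operationally, words can be written by skipping any run of whitespace before taking the next word. The name myWords is made up here; the definition is meant to behave like the Prelude's words, but treat it as an illustration rather than the library source.

```haskell
import Data.Char (isSpace)

-- Illustrative re-implementation of words: drop a whole run of whitespace,
-- then take characters up to the next whitespace character.
myWords :: String -> [String]
myWords s = case dropWhile isSpace s of
  ""   -> []
  rest -> let (w, rest') = break isSpace rest
          in w : myWords rest'

main :: IO ()
main = do
  print (myWords " one two three")  -- ["one","two","three"]
  print (myWords "a \n\t b")        -- ["a","b"]
```

Because dropWhile isSpace runs before every word, leading whitespace and adjacent whitespace characters never produce an empty string.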

leftaroundabout answered Nov 15 '22 07:11
