
Why does the UNIX sort utility ignore leading spaces without the option -b?

Tags: unix

[This is a rewrite of a similar question that I originally asked backwards. Sorry for the confusion!]

I'm confused about leading spaces and the standard sort utility. Consider the contents of myfile:

a
 b
  a

Executing sort -t : myfile yields an unexpected result, at least to me:

a
  a
 b

Does that make sense? <space> should come either before a-z (as is the case in ASCII), or after. In the first case I would expect

  a
 b
a

while in the second case

a
 b
  a

Why, then, does sort seem to apply the -b option (ignore leading blanks) even when it wasn't included? In fact, to be safe I added the -t option so that each line has exactly one field. (According to the POSIX standard, "A field comprises a maximal sequence of non-separating characters and, in the absence of option -t, any preceding field separator." sort myfile yields the same output, which is also unexpected.)

Thanks in advance!

ezequiel-garzon asked Aug 23 '11 23:08


2 Answers

It depends on the locale. With

LC_COLLATE=en_US.utf8 sort myfile

I get your unexpected result, and with

LC_COLLATE=C sort myfile

I get your expected result. Also see the related question "bash sort unusual order. Problem with spaces?"

(I don't know why sort handles -b and -t like this.)

David Andersson answered Oct 11 '22 23:10


$ sort -t : foo
a
    a
  b
$ env LC_ALL=C sort -t: foo
    a
  b
a

From the man page: *** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
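
To make the effect of LC_ALL=C concrete, here is a minimal sketch that recreates the three-line file from the question and sorts it by byte value. The scratch file from mktemp and the variable names are my own choices for illustration, not part of the original posts, and the sketch assumes a shell where sort and mktemp are available.

```shell
#!/bin/sh
# Recreate the sample from the question: "a", " b", "  a"
# (zero, one and two leading spaces respectively).
tmp=$(mktemp)   # hypothetical scratch file, not from the original posts
printf 'a\n b\n  a\n' > "$tmp"

# LC_ALL=C makes sort compare lines by byte value, so a space (0x20)
# sorts before 'a' (0x61) and leading spaces are significant.
c_sorted=$(LC_ALL=C sort "$tmp")
printf '%s\n' "$c_sorted"

# Clean up the scratch file.
rm -f "$tmp"
```

Under LC_ALL=C this prints the line with two leading spaces first, then " b", then "a", which is the first ordering the question expected. Running the same sort under a locale such as en_US.utf8 may instead reproduce the order from the question, depending on which locales are installed and how they collate.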

Rob Parker answered Oct 11 '22 23:10