I'm seeing something strange with 'sort' on Red Hat Enterprise Linux 5 x86_64 and on Ubuntu 9.1. I'm using bash.
First, here's what I expect from sort using dictionary order:
[stauffer@unix-m sortTrouble]$ cat st1
1230
123
100
11
10
1
123
1230
100
[stauffer@unix-m sortTrouble]$ sort st1
1
10
100
100
11
123
123
1230
1230
[stauffer@unix-m sortTrouble]$
Now here's what happens when there's a second column (tab-delimited, even though it looks messy here):
[stauffer@unix-m sortTrouble]$ cat st2
1230 1
123 1
100 1
11 1
10 1
1 1
123 1
1230 1
100 1
[stauffer@unix-m sortTrouble]$ sort st2
100 1
100 1
10 1
1 1
11 1
1230 1
1230 1
123 1
123 1
Notice how the sort order for column 1 is different now. '11' gets put correctly after '1', but '10' and '100' do not. Similarly for '1230'. It seems like zero causes trouble.
This behavior is inconsistent, and it causes problems when using 'join' because it expects dictionary sorting.
On Mac OS X 10.5, the st2 file sorts like st1 in the first column.
Am I missing something, or is this a bug?
Thanks, Michael
From the man page:
  -b, --ignore-leading-blanks
      ignore leading blanks
  -g, --general-numeric-sort
      compare according to general numerical value
  -n, --numeric-sort
      compare according to string numerical value
Example:
andrey@localhost:~/gamess$ echo -e "1\n2\n10" | sort
1
10
2
andrey@localhost:~/gamess$ echo -e "1\n2\n10" | sort -g
1
2
10
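For the difference between -n and -g from the excerpt above, here is a small sketch. It assumes GNU sort (-g is not part of POSIX), and the input values and variable names are made up for illustration: -n only reads the leading digits of "1e2", while -g parses it as 100.

```shell
# Hypothetical input: one value written in scientific notation.
input=$(printf '%s\n' 1e2 20 3)

# -n compares the leading numeric string, so "1e2" counts as 1.
numeric=$(printf '%s\n' "$input" | sort -n | paste -sd ' ' -)

# -g converts each line to floating point, so "1e2" counts as 100
# (assumes GNU sort; other implementations may not support -g).
general=$(printf '%s\n' "$input" | sort -g | paste -sd ' ' -)

echo "$numeric"
echo "$general"
```

Neither option gives the dictionary order that 'join' expects, so they only help if numeric order is what you actually want.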
The sort can be performed the way you want by restricting the key to the column you're interested in:
sort -k1,1 inputfile
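A sketch of that fix against the st2 data from the question, in case it helps. The temp directory and variable names are hypothetical. The explicit tab separator and the LC_ALL=C variant are not from the thread: the second is an assumption on my part that the odd ordering comes from locale-aware collation, which the GNU sort man page warns about.

```shell
# Recreate the tab-delimited st2 file from the question (hypothetical temp path).
tmpdir=$(mktemp -d)
printf '%s\t1\n' 1230 123 100 11 10 1 123 1230 100 > "$tmpdir/st2"

# Restrict the sort key to field 1, using an explicit tab as the separator.
by_key=$(sort -t "$(printf '\t')" -k1,1 "$tmpdir/st2" | cut -f1 | paste -sd ' ' -)

# Alternative (assumption, not from the thread): compare whole lines
# byte by byte by forcing the C locale.
by_c_locale=$(LC_ALL=C sort "$tmpdir/st2" | cut -f1 | paste -sd ' ' -)

echo "$by_key"
echo "$by_c_locale"
rm -rf "$tmpdir"
```

Both variants should print the first column in the same order as 'sort st1' in the question. If the output feeds 'join', it is probably safest to run 'join' under the same locale setting as 'sort'.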