Sorting pos/neg numbers with fractional parts using Unix sort

Question

Using sort (coreutils) 5.2.1

I have the following file, which I'd like to sort by the numeric (non-integer) part of field 4. The value can be negative or positive, and may also be INF.

field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=INF field5 field6

I would like this to be sorted as

field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6

Given that the numeric part of the field starts at character position 4 (assuming the indexing starts at 0, which I'm not sure of), I have tried sort with the following options:

sort -g -k4.4 inputfile
sort -g -k4.5 inputfile
sort -n -k4.4 inputfile
sort -n -k4.5 inputfile
sort -g inputfile

These all yield the following, which is close, but not quite right. The magnitudes are sorted correctly, but I'd like the most negative value on top.

field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6

How can I make sort behave?

FWIW, here's more information:

LANG = en_US.UTF-8
Red Hat Enterprise Linux WS release 4 (Nahant Update 6)

James Webster · Accepted Answer

I am on a Mac, so it may be a slightly different implementation, but I found this to work:

sort -gb -k 4.5,4 inputfile

In English: "sort the file inputfile in general numeric order (-g), ignoring leading blanks (-b), using as the key (-k) the part of the 4th field that runs from its 5th character to the end of the 4th field."

field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
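
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample input from the question and runs the same key definition. It assumes a sort that supports the non-POSIX -g option (GNU and BSD sort do); the temporary file and the variable names input and sorted are just for illustration. With GNU sort, the -b seems to be what makes the difference: without it the blank before "tag=" counts as the first character of field 4, so 4.5 lands on "=" rather than on the number.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=INF field5 field6
EOF

# -g     : general numeric comparison (handles signs, fractions and INF)
# -b     : skip the leading blank of the field before counting characters
# -k 4.5,4 : key runs from character 5 of field 4 to the end of field 4
# LC_ALL=C keeps the decimal point and the tie-breaking locale-independent.
sorted=$(LC_ALL=C sort -g -b -k 4.5,4 "$input")

printf '%s\n' "$sorted"
rm -f "$input"
```

The most negative value should come out first and the INF lines last.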

iruvar · Answer

You could add a pre-processing awk step that appends a new field containing the numeric portion of field 4 (or a numeric stand-in for INF), sort by that field, and then strip it again in a post-processing step. Note that in the example below INF is set to an arbitrarily high value of 10**10; set it higher if a naturally occurring number in the input exceeds this value.

awk '{x=$4; sub("tag=", "", x); sub("INF", 10**10, x); print $0, x}' file.txt |
sort -k7,7g |
cut -f-6 -d' '

field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
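
If -g is not available, the same decorate/sort/undecorate idea can be sketched with POSIX options only. This is a variant, not the exact pipeline above: it puts the key in front of the line, separated by a tab, so the number of fields in the input does not matter, and it writes the INF stand-in as the literal 10000000000 because sort -n does not understand exponents. The stand-in value, the temporary file and the variable names (input, tab, result) are assumptions made for the example.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=INF field5 field6
EOF

# A literal tab character, used as the separator between key and line.
tab=$(printf '\t')

# 1. awk : prefix each line with its sort key (INF becomes a large number)
# 2. sort: order numerically on the first tab-separated field only
# 3. cut : drop the key again, leaving the original line untouched
result=$(
  awk '{
    key = $4
    sub(/^tag=/, "", key)
    if (key == "INF") key = "10000000000"
    print key "\t" $0
  }' "$input" |
  LC_ALL=C sort -t "$tab" -k1,1n |
  cut -f2-
)

printf '%s\n' "$result"
rm -f "$input"
```

As with the original, the stand-in has to be larger than any real value in the data, otherwise the INF lines will not end up last.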

Tags: linux, unix, sorting

tomocafe