Using sort (coreutils) 5.2.1
I have the following file, which I'd like to sort by the non-integer part of field 4. This can be a negative or positive number, and might also have the value INF.
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=INF field5 field6
I would like this to be sorted as
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
Given that the number part of the field is at character position 4 (assuming the indexing starts at 0, and I'm not sure of this), I have tried sort with the following options:
sort -g -k4.4 inputfilesort -g -k4.5 inputfilesort -n -k4.4 inputfilesort -n -k4.5 inputfilesort -g inputfileThese all yield the following, which is close, but not quite right. The magnitudes are sorted correctly, but I'd like the most negative value on top.
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
How can I make sort behave?
FWIW, here's more information:
LANG = en_US.UTF-8
Red Hat Enterprise Linux WS release 4 (Nahant Update 6)
I am on a Mac, so it may be a slightly different implementation, but I found this to work:
sort -gb -k 4.5,4 inputfile
In English: "sort, in a -general numeric fashion, ignoring -blanks, the file inputfile using the 4th -k(c)olumn's data, from the 5th element in that column to the end of the data in the 4th column"
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
You could add a pre-processing awk step that adds a new field at the end containing the numeric portion or the numeric representation from field 4, and sort by this field. Add a post-processing step to strip this field. Note that in the example below, INF has been set to an arbitrary high value of 10**10, you can set it to a higher value if you have a naturally occurring number in the input that exceeds this value
awk '{x=$4; sub("tag=", "", x); sub("INF", 10**10, x); print $0, x}' file.txt |
sort -k7,7g |
cut -f-6 -d' '
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With