Using sort (coreutils) 5.2.1
I have the following file, which I'd like to sort by the non-integer part of field 4. This can be a negative or positive number, and might also have the value INF.
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=INF field5 field6
I would like this to be sorted as
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
Given that the number part of the field is at character position 4 (assuming the indexing starts at 0, and I'm not sure of this), I have tried sort
with the following options:
sort -g -k4.4 inputfile
sort -g -k4.5 inputfile
sort -n -k4.4 inputfile
sort -n -k4.5 inputfile
sort -g inputfile
These all yield the following, which is close, but not quite right. The magnitudes are sorted correctly, but I'd like the most negative value on top.
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
How can I make sort
behave?
FWIW, here's more information:
LANG = en_US.UTF-8
Red Hat Enterprise Linux WS release 4 (Nahant Update 6)
I am on a Mac, so it may be a slightly different implementation, but I found this to work:
sort -gb -k 4.5,4 inputfile
In English: "sort, in a -general numeric fashion, ignoring -blanks, the file inputfile using the 4th -k(c)olumn's data, from the 5th element in that column to the end of the data in the 4th column"
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
You could add a pre-processing awk step that adds a new field at the end containing the numeric portion or the numeric representation from field 4, and sort by this field. Add a post-processing step to strip this field. Note that in the example below, INF
has been set to an arbitrary high value of 10**10
, you can set it to a higher value if you have a naturally occurring number in the input that exceeds this value
awk '{x=$4; sub("tag=", "", x); sub("INF", 10**10, x); print $0, x}' file.txt |
sort -k7,7g |
cut -f-6 -d' '
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With