
Sort scientific and float

I have been trying desperately to use the sort command to sort a mixture of values in scientific and floating-point notation, both positive and negative, e.g.:

-2.0e+00
2.0e+01
2.0e+02
-3.0e-02
3.0e-03
3.0e-02

Without the decimal point or without the scientific exponent, sort -k1 -g file.dat works just fine. With both at once, as in the data above, it results in:

-3.0e-02
-2.0e+00
2.0e+01
2.0e+02
3.0e-02
3.0e-03

This is obviously wrong since it should be:

-2.0e+00
-3.0e-02
3.0e-03
3.0e-02
...

Any idea how I can solve this issue? And once I solve this, is there a way to sort by absolute value (e.g. get rid of the negative ones)? I know I could square each value, sort, and take the square root, but that would lose precision, and it would be neat to have a nice, fast and straightforward way.

My linux system: 8.12, Copyright © 2011

Thank you very much!

UPDATE: if I run it in debug mode with sort -k1 -g filename.dat --debug, I get the following result (I translated it into English; the output was in German):

 sort: using "de_DE.UTF-8" sorting rules
 sort: key 1 is numeric and spans multiple fields
-3.0e-02
__
________
-2.0e+00
__
________
2.0e+01
_
_______
2.0e+02
_
_______
3.0e-02
_
_______
3.0e-03
_
_______
Jan asked Oct 13 '14 11:10



1 Answer

Based on the comments under the question, this is a locale issue: sort is using a locale that expects , as the decimal separator, while your text uses . instead. The ideal solution would be to make sort use a different locale, and hopefully someone will write a correct answer covering that.
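One common way to make sort use a different locale is to override it for that single command with the LC_ALL environment variable. The following is a minimal sketch, assuming GNU sort (for -g) and using a temporary file in place of file.dat; the variable names data and sorted are only for illustration.

```shell
# Sample data from the question, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
-2.0e+00
2.0e+01
2.0e+02
-3.0e-02
3.0e-03
3.0e-02
EOF

# LC_ALL=C applies only to this one sort invocation, so "." is
# treated as the decimal point regardless of the session locale.
sorted=$(LC_ALL=C sort -g "$data")
printf '%s\n' "$sorted"

rm -f "$data"
```

Because the override is scoped to one command, the rest of the shell session keeps its de_DE.UTF-8 settings.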

But if you can't, or don't want to, change how sort works, you can change the input it gets instead. The easiest way is to have sort read its input from a pipe and modify the data on the way. Here it is enough to change every . to ,, so the tool of choice is tr:

cat file.dat | tr . , | sort -k1 -g 

This solution has one big drawback: if the command is run under a locale where sort uses . as the decimal separator, it will break the sorting instead of fixing it. So if you are writing a shell script that may be used elsewhere, don't do this.
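If the tr approach is still wanted in a script, one way to soften that drawback is to ask the active locale for its decimal separator first and translate only when it is a comma. This is a rough sketch under assumptions: it relies on the locale utility's decimal_point keyword, falls back to . if that utility is missing, and converts the commas back afterwards, since the tr pipeline otherwise prints , in the sorted output. The names sep and portable_sorted are made up for the example.

```shell
# Sample data from the question, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
-2.0e+00
2.0e+01
2.0e+02
-3.0e-02
3.0e-03
3.0e-02
EOF

# Query the decimal separator of the active locale; assume "." if
# the locale utility is not available on this system.
sep=$(locale decimal_point 2>/dev/null || echo .)

if [ "$sep" = "," ]; then
    # Comma locale: feed sort commas, then restore the dots.
    portable_sorted=$(tr . , < "$data" | sort -g | tr , .)
else
    # Dot locale: the data can be sorted as it is.
    portable_sorted=$(sort -g "$data")
fi
printf '%s\n' "$portable_sorted"

rm -f "$data"
```

Overriding the locale for the sort call is usually simpler; this variant only matters when the locale has to stay untouched.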

Important note: the above command has an unnecessary use of cat, since tr can read the file through redirection (tr . , < file.dat | sort -k1 -g). Anyone who wants to be taken seriously as a professional shell script programmer shouldn't do that!
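For the second part of the question, sorting by absolute value, a possible sketch is to strip the leading minus sign before sorting. This assumes "get rid of the negative ones" means dropping the signs (the output then no longer shows which values were negative), and it again assumes GNU sort with the locale forced to C; abs_sorted is an illustrative name.

```shell
# Sample data from the question, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
-2.0e+00
2.0e+01
2.0e+02
-3.0e-02
3.0e-03
3.0e-02
EOF

# Remove a leading "-" from every line, then sort numerically.
# No squaring or square roots are involved, so the digits are unchanged.
abs_sorted=$(sed 's/^-//' "$data" | LC_ALL=C sort -g)
printf '%s\n' "$abs_sorted"

rm -f "$data"
```

If the original signs must survive, the absolute value would have to be added as a separate sort key and removed again after sorting.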

hyde answered Oct 31 '22 03:10
