I found some questions about this, but none of them really answered my question.
I have a tabulated file like this:
2 10610 0 0 0 0.0105292
2 10649 0 0 0 0.041959
2 10682 0 0 0 0.0449746
2 10705 0 0 0 0.0441639
2 10797 2 0 0 0.0342728
2 10955 0 0 0 0.0136986
2 10957 0 0 0 0.0135135
2 11124 0 0 0 0.0583367
2 11336 1 0 0 0.0219502
and I used this command:
awk '{if ($6 > 0.4) print $6}' myfile
And here is the output:
0.0105292
0.041959
0.0449746
0.0441639
0.0342728
0.0136986
0.0135135
0.0583367
0.0219502
It's returning all the values in the 6th column. Here I should get no results, since the condition is not met. So I guess awk is not treating $6 as a float.
I tried other syntaxes but I still have the same problem.
I also tried the command on the first column, and there it works.
PS: I'm on Mac OS X.
Edit: It does work when I use awk '{print $6}', though.
It's your locale setting (see https://www.gnu.org/software/gawk/manual/gawk.html#Locales and specifically https://www.gnu.org/software/gawk/manual/gawk.html#Locale-influences-conversions). Explicitly setting LC_ALL=C is one way to solve the problem:
LC_ALL=C awk '{if ($6 > 0.4) print $6}' myfile
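To see the fix in isolation, here is a small sketch that recreates a few rows of your sample in a temporary file and runs the filter under LC_ALL=C. The variable names and the second threshold of 0.04 are mine, added only so that one filter prints something and the other prints nothing.

```shell
# Recreate a few rows of the sample data in a temporary file (illustrative only).
sample=$(mktemp)
printf '%s\n' \
  '2 10610 0 0 0 0.0105292' \
  '2 10649 0 0 0 0.041959' \
  '2 10797 2 0 0 0.0342728' \
  '2 11124 0 0 0 0.0583367' > "$sample"

# Under the C locale "." is the decimal point, so $6 is compared as a number.
above_04=$(LC_ALL=C awk '$6 > 0.4 { print $6 }' "$sample")
above_004=$(LC_ALL=C awk '$6 > 0.04 { print $6 }' "$sample")

echo "above 0.4:  [$above_04]"    # nothing between the brackets
echo "above 0.04: [$above_004]"   # 0.041959 and 0.0583367

rm -f "$sample"
```

Setting LC_ALL on the command line only affects that one awk invocation, so the rest of your shell session keeps its normal locale.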
What's happening is that you're trying to use a decimal point of "." but your locale (typical in most European countries and many others) uses "," instead. So when your input contains:
0.0105292
awk does not recognize it as looking like a number in your locale, so instead it gets treated as a string. If your input was instead:
0,0105292
THEN awk would recognize it as a number (so this is the other way to solve your problem - use commas as the decimal point in your input).
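If you want to confirm which decimal point your locale uses, something like the following should tell you. This is a sketch that assumes your system's locale utility accepts keyword queries such as decimal_point (POSIX specifies this, but I have not checked every platform); the variable names are just for illustration.

```shell
# Decimal point of whatever locale is currently in effect.
current_point=$(locale decimal_point)

# Decimal point of the C locale, for comparison.
c_point=$(LC_ALL=C locale decimal_point)

echo "current locale: '$current_point'   C locale: '$c_point'"
```

If the first value is a comma, that matches the behavior described above.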
So to awk, your code:
$6 > 0.4
is the string "0.0105292" being compared to the number 0.4 (per POSIX, "." is always the decimal point when used in the code), and per this comparison table from the gawk manual:
        +----------------------------------------------
        |       STRING          NUMERIC         STRNUM
--------+----------------------------------------------
        |
STRING  |       string          string          string
        |
NUMERIC |       string          numeric         numeric
        |
STRNUM  |       string          numeric         numeric
--------+----------------------------------------------
we see that the type of comparison performed when a string is compared to a number (or anything else) is a string comparison.
So in your original code the string "0.0105292" is being string-compared with the number 0.4, and awk is apparently deciding that the former is greater than the latter (I don't know why; maybe some other locale effect).
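The difference between the two kinds of comparison can be sketched with constants, run under LC_ALL=C so the result does not depend on your locale. The third case is purely my guess at a possible mechanism: it assumes the threshold ends up rendered with a comma, as "0,4", which I have not verified is what your awk actually does.

```shell
# Numeric comparison: both operands are numeric constants.
numeric=$(LC_ALL=C awk 'BEGIN { print (0.0105292 > 0.4) }')

# String comparison: both operands are strings, compared character by character.
string_dot=$(LC_ALL=C awk 'BEGIN { print ("0.0105292" > "0.4") }')

# Assumed case: the threshold written with a comma instead of a dot.
# "." sorts after "," in the C locale, so the left-hand string compares greater.
string_comma=$(LC_ALL=C awk 'BEGIN { print ("0.0105292" > "0,4") }')

echo "numeric=$numeric string_dot=$string_dot string_comma=$string_comma"
# prints: numeric=0 string_dot=0 string_comma=1
```

So a plain string comparison against "0.4" would still reject every value in your file; only something like the comma rendering makes every row pass, which is why I suspect a further locale effect.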