 

Filter out values less than a threshold from a CSV file

Tags: csv, awk

My CSV file is tab-delimited, and I am trying to filter out the rows whose p-values are greater than 0.05 (in other words, I want to keep the entries with p <= 0.05). The p-values are in the 7th column, and I tried the following:

 awk '$7 <= 0.05 {print $0}' rawFile.csv > filteredFile.csv

But this filtering does not work; it returns the file unfiltered.

The p-values in column 7 look like this: 0.33532935, 0.0, 0.8591287

TonyGW asked Nov 18 '13 17:11


1 Answer

Try this one:

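awk 'BEGIN {FS="\t"} {if ($7 <= 0.05) print $0}' rawFile.csv > filteredFile.csv

The BEGIN clause gives you a place to change the default field separator (FS) to a tab character ("\t"). The separator has to be written in double quotes here: single quotes inside the single-quoted awk program would end the shell string early, and awk would never see the intended tab. Some older versions of awk may not handle this, in which case gawk might be a helpful alternative.

The main logic happens inside the second set of curly braces, where you tell awk to print the line if column 7 is <= 0.05.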
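To check the behaviour on something small before running it on the real file, here is a minimal sketch. The file names sample.tsv and filtered.tsv and the rows in them are made up for illustration; only the 7th column matters. It uses the -F option instead of a BEGIN block, which is an equivalent way to set the separator, and adds 0 to $7 as a precaution so the comparison is numeric rather than string-based.

```shell
# Build a small tab-delimited sample (hypothetical data; p-values in column 7).
printf 'g1\ta\tb\tc\td\te\t0.33532935\n' >  sample.tsv
printf 'g2\ta\tb\tc\td\te\t0.0\n'        >> sample.tsv
printf 'g3\ta\tb\tc\td\te\t0.8591287\n'  >> sample.tsv
printf 'g4\ta\tb\tc\td\te\t0.05\n'       >> sample.tsv

# -F '\t' splits fields on tabs only, so spaces inside a field do not shift columns.
# $7 + 0 forces column 7 to be treated as a number before comparing.
# A pattern with no action prints the matching line, same as {print $0}.
awk -F '\t' '$7 + 0 <= 0.05' sample.tsv > filtered.tsv

cat filtered.tsv
```

Only the g2 and g4 rows should end up in filtered.tsv. Note that if the real file has a header row, its non-numeric 7th field evaluates to 0 under $7 + 0, so the header is kept as well.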

mclaugpa answered Sep 16 '22 21:09