I have a .csv file with a header row, like so:
headerA,headerB,headerC
bill,jones,p
mike,smith,f
sally,silly,p
I'd like to filter out any records with the f value in the headerC column.
Can I do that with sed or awk?
If the name of the third column in the header is not just f:
sed '/,f$/d' FILE
will do (it deletes every line of the input that ends with ,f).
If the third column's name is exactly f, I'd go with:
sed -n -e '1p;/,[^f]$/p' FILE
(Nothing is printed by default (-n), but the 1st line must always be printed (1p), and so must every line that ends with a comma followed by a single character other than f. Note: this will not work if the 3rd column contains more than one character.)
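To get around the single-character limitation noted above, one option is to skip the header by line number and delete matching lines only from line 2 onward. This is a minimal sketch, not a tested drop-in: the sample rows and the variable names sample and result are made up for illustration, and it still assumes the column to test is the last one on each line.

```shell
# Hypothetical sample: the third column is named "f" and holds multi-character values.
sample=$(mktemp)
printf '%s\n' 'a,b,f' 'bill,jones,pass' 'mike,smith,f' 'sally,silly,off' > "$sample"

# From line 2 to the last line, delete lines whose final field is exactly "f".
# Line 1 (the header) is outside the address range, so it is always kept.
result=$(sed '2,${/,f$/d;}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because the header is excluded by address rather than by pattern, the same command works whether or not the header itself ends with ,f.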
And an awk one:
awk -F, 'NR == 1 ; NR > 1 && $3 != "f"' FILE
(This always prints the first line: NR == 1 is true there, and a pattern without an action triggers the default action, which is print $0. The second condition checks that we are past the 1st line and that the 3rd field is not f, and then the default action prints the line again.)
HTH
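Well, if you know that headerC is always the third column, the following sed command would work (the pattern is anchored at the start of the line so that only a third field that is exactly f matches):
sed -r '/^[^,]*,[^,]*,f(,|$)/d' < file.csv > filefiltered.csv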
And the following awk command does the same:
awk 'BEGIN {FS=","} {if($3 != "f") print}' file.csv
If you don't know that headerC is always in a particular column, it gets a little trickier. Does this work?
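One way to handle that case is to look the column up by name in the header row and then test that field on every later row. The following is a minimal sketch under some assumptions: the sample file, the awk variables name, bad and col, and the shell variables sample and filtered are all hypothetical, and it treats every comma as a separator, so quoted fields containing commas are not supported.

```shell
# Hypothetical sample in which headerC is the second column, not the third.
sample=$(mktemp)
printf '%s\n' 'headerA,headerC,headerB' 'bill,p,jones' 'mike,f,smith' 'sally,p,silly' > "$sample"

filtered=$(awk -F, -v name=headerC -v bad=f '
    NR == 1 {
        # Find the position of the named column in the header row.
        for (i = 1; i <= NF; i++) if ($i == name) col = i
        print            # always keep the header
        next
    }
    # Keep the row if the column was not found, or if its value is not the unwanted one.
    !col || $col != bad
' "$sample")

printf '%s\n' "$filtered"
rm -f "$sample"
```

If the header name is not found, this sketch passes every row through unchanged rather than guessing a column.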