I am trying to use awk to select/remove data based on cell entries in a CSV file.
How do I chain Awk commands to build up complex searches like I have done with grep? I plan to use Awk to select rows based on matching criteria in cells in multiple columns, not just the first column as in this example.
Test data
123,line1
123a,line2
abc,line3
G-123,line4
G-123a,line5
Separate Awk statements with intermediate files
awk '$1 !~ /^[[:digit:]]/ {print $0}' file.txt > output1.txt
awk '$1 !~ /^G-[[:digit:]]/ {print $0}' output1.txt > output2.txt
mv output2.txt output.txt
cat output.txt
Chained or multi-line grep version (I think limited to first column only)
grep -v \
-e "^[[:digit:]]" \
-e "^G-[[:digit:]]" \
file.txt > output.txt
cat output.txt
How can I rewrite the Awk command to avoid the intermediate files?
Generally, in awk there are boolean operators available (it's better than grep! :) )
awk '/match1/ || /match2/' file
awk '(/match1/ || /match2/ ) && /match3/' file
and so on ...
In your example you could use something like:
awk -F, '$1 ~ /^[[:digit:]]/ || $1 ~ /G-[[:digit:]]/' input >> output
Note: This is just an example of how to use boolean operators. Also the regular expression itself could have been used here to express the alternative match:
awk -F, '$1 ~ /^(G-)?[[:digit:]]/' input >> ouput
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With