AWK to filter CSV files

2 Answers

If the columns in your fileCSV.csv are separated by commas, then you need to set the field separator with -F:

awk -F, '$3 !~ /^synonymous/' fileCSV.csv > fileCSV2.csv

If -F does not work with your version of awk, try setting FS in a BEGIN block instead:

awk 'BEGIN{FS=","} $3 !~ /^synonymous/' fileCSV.csv > fileCSV2.csv

EDIT: If the fields are wrapped in double quotes, the opening " is part of the field, so you also need to take it into account in the pattern: use /^"synonymous/
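To see what the quoted pattern does, here is a small sketch on made-up data. The temporary file, its rows, and the variable names are hypothetical and not taken from the question:

```shell
# Hypothetical sample data; the third column holds the annotation.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
"chr1","100","synonymous SNV","geneA"
"chr1","200","nonsynonymous SNV","geneB"
"chr2","300","stopgain","geneC"
EOF

# awk splits on bare commas here, so each field still carries its quotes.
# Keep only rows whose third field does not start with "synonymous.
filtered=$(awk -F, '$3 !~ /^"synonymous/' "$tmpdir/sample.csv")
printf '%s\n' "$filtered"
rm -rf "$tmpdir"
```

The "nonsynonymous SNV" row survives because the pattern is anchored at the start of the field with ^.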

answered Oct 18 '22 13:10

rzymek

To process a CSV file with awk, I prefer to preprocess it with sed so that the quotation marks are handled automatically.

For your concrete question I would use

sed -e 's/^"//;s/"$//' fileCSV.csv | awk -F '"?,"?' '$3 !~ /^synonymous/'

If you also want to correctly process files whose string fields contain quotation marks (which CSV files represent as two consecutive double quotes), you need to extend the sed expression as follows:

sed -e 's/^"//;s/"$//;s/""/"/g' fileCSV.csv | awk -F '"?,"?' '$3 !~ /^synonymous/'
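As a rough illustration of the extra s/""/"/g step, the sketch below runs the same pipeline on two made-up rows whose second field contains escaped quotes. The file, the rows, and the variable names are assumptions for the example only:

```shell
# Hypothetical sample data: the second field contains doubled (escaped) quotes.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
"geneA","a ""rare"" variant","synonymous SNV"
"geneB","a ""common"" variant","missense"
EOF

# Strip the outer quotes, turn each "" back into a single ", then let awk
# split on a comma with an optional quote on either side and print field 2.
second_field=$(sed -e 's/^"//;s/"$//;s/""/"/g' "$tmpdir/sample.csv" |
  awk -F '"?,"?' '$3 !~ /^synonymous/ { print $2 }')
printf '%s\n' "$second_field"
rm -rf "$tmpdir"
```

Note that this approach still splits on every comma, so it assumes no field contains a comma of its own.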

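This method has the advantage that awk sees the fields without their surrounding quotes, so you can print or process individual fields correctly. For example, if you want to print the first and fifth fields of the filtered lines, separated by a : (with a space on either side, because awk joins comma-separated print arguments with a space by default), you can now use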

sed -e 's/^"//;s/"$//;s/""/"/g' fileCSV.csv | awk -F '"?,"?' '$3 !~ /^synonymous/ { print $1,":",$5}'

(If the difference between the two methods is not clear to you, try the last awk command without the sed preprocessing and compare the output.)
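That comparison can be sketched as follows, again on made-up data (the file, rows, and variable names are hypothetical):

```shell
# Hypothetical sample data with every field quoted.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
"a1","b1","missense","d1","e1"
"a2","b2","synonymous","d2","e2"
EOF

# With the sed step: the outer quotes are removed before awk splits the line.
with_sed=$(sed -e 's/^"//;s/"$//;s/""/"/g' "$tmpdir/sample.csv" |
  awk -F '"?,"?' '$3 !~ /^synonymous/ { print $1,":",$5}')

# Without it: the separator regex eats the inner quotes, but the first and
# last fields keep the quote at the very start and end of the line.
without_sed=$(awk -F '"?,"?' '$3 !~ /^synonymous/ { print $1,":",$5}' "$tmpdir/sample.csv")

printf 'with sed:    %s\n' "$with_sed"
printf 'without sed: %s\n' "$without_sed"
rm -rf "$tmpdir"
```

With these rows the first command prints a1 : e1, while the second prints "a1 : e1" with the stray outer quotes still attached.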

answered Oct 18 '22 15:10

alphanum

Tags: csv, awk

TonyGW

People also ask

2 Answers

rzymek

alphanum

Recent Activity

Donate For Us