
How to make awk ignore the field delimiter inside double quotes? [duplicate]

Tags: bash, shell, awk

I need to delete two columns from a comma-separated values file. Consider the following lines in the CSV file:

"[email protected],www.example.com",field2,field3,field4
"[email protected]",field2,field3,field4

This is the result I want:

"[email protected],www.example.com",field4
"[email protected]",field4

I used the following command:

awk 'BEGIN{FS=OFS=","}{print $1,$4}' 

But the embedded comma inside the quotes causes a problem. This is the result I am getting:

"[email protected],field3
"[email protected]",field4

My question: how do I make awk ignore the commas that are inside the double quotes?

Deepak K M asked Apr 15 '15 05:04




1 Answer

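From the GNU awk manual (http://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content), using FPAT, a gawk variable that describes what a field looks like instead of what separates fields: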

$ awk -vFPAT='([^,]*)|("[^"]+")' -vOFS=, '{print $1,$4}' file
"[email protected],www.example.com",field4
"[email protected]",field4

Also see "What's the most robust way to efficiently parse CSV using awk?" for parsing CSVs more generally, including fields that contain newlines.
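FPAT is a gawk extension, so the command above will not work in other awks. As a rough, hedged sketch of the same idea for a POSIX awk, the fields can be peeled off the record by hand. The function name print_cols_1_4 and the sample addresses are made up for illustration, and the sketch assumes there are no escaped quotes ("") and no newlines inside fields:

```shell
#!/bin/sh
# Portable sketch: print fields 1 and 4 of simple CSV without relying on FPAT.
# Assumption: no escaped quotes ("") and no newlines inside quoted fields.
# print_cols_1_4 is a hypothetical helper name, not part of any tool.
print_cols_1_4() {
  awk '
  {
    split("", f)              # clear fields left over from the previous record
    n = 0
    rest = $0
    do {
      if (match(rest, /^"[^"]*"/)) {
        len = RLENGTH         # quoted field: keep everything through the closing quote
      } else {
        i = index(rest, ",")  # unquoted field: runs up to the next comma, if any
        len = (i ? i - 1 : length(rest))
      }
      f[++n] = substr(rest, 1, len)
      more = (substr(rest, len + 1, 1) == ",")   # a comma means another field follows
      rest = substr(rest, len + 2)               # drop the field and its comma
    } while (more)
    print f[1] "," f[4]
  }' "$@"
}

# Placeholder input shaped like the data in the question (addresses are invented).
printf '%s\n' \
  '"user1@example.com,www.example.com",field2,field3,field4' \
  '"user2@example.com",field2,field3,field4' |
  print_cols_1_4
```

The loop treats a leading double quote as the start of a field that ends at the next double quote, and everything else as a field that ends at the next comma. With gawk available, the FPAT one-liner is shorter and is probably the better choice.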

Ed Morton answered Oct 12 '22 04:10