
Split string on commas but ignore commas within double-quotes using shell scripting in a .csv file?

Tags: shell, csv, sed, awk

The sample file 'null.csv' contains:

71131940,2015-05-01,"JEWELLERY,ITEM",P,,W

I have a .csv file in which I want to handle commas (,) and null values (,,), so that when I split each line on commas, the commas within double quotes are ignored and I do not get output like this:

71131940,2015-05-01,JEWELLERY,ITEM,P,,W

I handled the null values (,,) by replacing them with (,0,) using sed:

sed -i -e "s/,,/,0,/g" null.csv

and got output like this:

71131940,2015-05-01,JEWELLERY,ITEM,P,0,W

But the problem is that I don't want "JEWELLERY,ITEM" to be split into JEWELLERY and ITEM.

Any kind of help will be appreciated.

shivam gupta asked Oct 26 '25 18:10



1 Answer

I'm sure this has been asked and answered a million times, but in any case, for input formatted as simply as you have shown (e.g. no escaped quotes or newlines within quoted fields):

$ awk -v FPAT='[^,]*|"[^"]*"' '{for (i=1;i<=NF;i++) print i, $i}' file
1 71131940
2 2015-05-01
3 "JEWELLERY,ITEM"
4 P
5
6 W

The above uses GNU awk for FPAT (see https://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content).
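
If GNU awk isn't available (for example, when only mawk or busybox awk is installed), the same split can be approximated by walking each line character by character and tracking whether you are inside quotes. This is only a rough sketch under the same assumptions as above (no escaped quotes, no newlines within quoted fields); the function name split_csv is made up for illustration and is not part of any tool:

```shell
# Hypothetical helper: print "index field" for every field of a simple CSV line.
# Assumes no escaped quotes ("") and no newlines inside quoted fields.
split_csv() {
    awk '{
        n = 0; field = ""; inq = 0
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                inq = !inq              # toggle quoted state, keep the quote character
                field = field c
            } else if (c == "," && !inq) {
                print ++n, field        # an unquoted comma ends the current field
                field = ""
            } else {
                field = field c         # any other character, or a comma inside quotes
            }
        }
        print ++n, field                # the last field has no trailing comma
    }' "$@"
}
```

Calling it as split_csv null.csv should print the same six numbered fields as the FPAT version, with "JEWELLERY,ITEM" kept as one field and the empty fifth field preserved. Because it uses only substr and length, it should behave the same in any POSIX awk, at the cost of being slower than FPAT on large files.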

Ed Morton answered Oct 29 '25 09:10



