I have a comma delimited file with 12 columns.
There is problem with 5th and 6th columns (the text in 5th and 6th column is identical, but may have extra commas between them) which contains extra commas.
2011,123456,1234567,12345678,Hey There,How are you,Hey There,How are you,882864309037,ABC ABCD,LABACD,1.00000000,80.2500000,One Two
So in the above example "Hey There,How are you" should not have a comma.
I need to remove extra commas in 5th and 6th column.
If you always want to remove the 5th comma, try
sed 's/,//5' input.txt
But you are saying, it may have extra commas. You have to provide a logic how to find out if extra commas are there or not.
If you know the number of commas, you can use awk. This has proven to be quite an exercise, I am sure someone else will come up with a more elegant solution, but I'll share mine anyway:
awk -f script.awk input.txt
with script.awk:
BEGIN{
FS=","
}
NF<=12{
print $0
}
NF>12{
for (i=1; i<=4; i++) printf $i FS
for (j=0; j<2; j++){
for (i=0; i<=(NF-12)/2; i++){
printf $(i+5)
if (i<(NF-12)/2) printf "_"
else printf FS
}
}
for (i=NF-5; i<=NF; i++) printf $i FS
printf "n"
}
First we set the field separator to ,
. If we count less or equal to 12
fields, everything's fine and we simply print the whole line. If there are more than 12 fields, we print first the first 4 fields (again with the field separator), and then we print twice field 5 (and field 6), but instead of printing the ,
, we exchange it with _
. In the end we print the remaining fields.
As I said, there is probably a more elegant solution to this. I wonder with what other people come up.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With