Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

replacing doublequotes in csv

I've got nearly the following problem and didn't find the solution. This could be my CSV file structure:

1223;"B630521 ("L" fixed bracket)";"2" width";"length: 5"";2;alternate A
1224;"B630522 ("L" fixed bracket)";"3" width";"length: 6"";2;alternate B

As you can see there are some " written for inch and "L" in the enclosing ".

Now I'm looking for a UNIX shell script to replace the " (inch) and "L" double quotes with 2 single quotes, like the following example:

sed "s/$OLD/$NEW/g" $QFILE > $TFILE && mv $TFILE $QFILE

Can anyone help me?

like image 704
Karsten Avatar asked May 12 '11 14:05

Karsten


People also ask

How do I replace a double quote in a csv file?

If you want to remove all double quotes, simply select the entire worksheet, then go to the Editing group of the Home tab and click the Find and select drop down. Click on Replace (or hit CTRL + h). In the “Find what” field of the Dialog box type in a double quote.

How do I remove double quotes from CSV?

csv file. Use the Foreach-Object cmdlet (% is an alias) to read each line as it comes from the file. Inside the script block for the Foreach-Object command, use the $_ automatic variable to reference the current line and the replace operator to replace a quotation mark with nothing.

Why does my CSV have double quotes?

CSV files use double-quote marks to delimit field values that have spaces, so a value like Santa Claus gets saved as “Santa Claus” to preserve the space. If a field value contains double-quotes, then the double-quotes get doubled-up so the parser can tell the difference.


1 Answers

Update (Using perl it easy since you get full lookahead features)

perl -pe 's/(?<!^)(?<!;)"(?!(;|$))/'"'"'/g' file

Output

1223;"B630521 ('L' fixed bracket)";"2' width";"length: 5'";2;alternate A
1224;"B630522 ('L' fixed bracket)";"3' width";"length: 6'";2;alternate B

Using sed, grep only

Just by using grep, sed (and not perl, php, python etc) a not so elegant solution can be:

grep -o '[^;]*' file | sed  's/"/`/; s/"$/`/; s/"/'"'"'/g; s/`/"/g' 

Output - for your input file it gives:

1223
"B630521 ('L' fixed bracket)"
"2' width"
"length: 5'"
2
alternate A
1224
"B630522 ('L' fixed bracket)"
"3' width"
"length: 6'"
2
alternate B
  • grep -o is basically splitting the input by ;
  • sed first replaces " at start of line by `
  • then it replaces " at end of line by another `
  • it then replaces all remaining double quotes " by single quite '
  • finally it puts back all " at the start and end
like image 183
anubhava Avatar answered Sep 22 '22 15:09

anubhava