This is a follow-up to a related question answered here.
I have a weekly CSV file that needs to be parsed. It looks like this:
"asdf","asdf","asdf","asdf"
But sometimes a text field contains extra unescaped double quotes, like this:
"asdf","as "something" df","asdf","asdf"
From the other posts on here, I was able to put together a regex
(?m)""(?![ \t]*(,|$))
which matches two successive double quotes only "if they DON'T have a comma or end-of-the-line ahead of them with optionally spaces and tabs in between".
However, this only finds double quotes in succession. How do I modify it to find and replace/delete the double quotes around "something" in the file?
Thanks.
(?<!^|,)"(?!,|$)
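will match a double quote that is neither preceded nor followed by a comma and is not at the start or end of a line. To run it over the whole file at once, enable multiline mode (as with the (?m) flag in your pattern) so that ^ and $ match at every line break.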
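As a rough illustration, here is how that could look in Python. This is only a sketch: Python's `re` module rejects `(?<!^|,)` because the lookbehind alternatives differ in width, so the lookbehind is split into two separate assertions, which should be equivalent. The function name `strip_inner_quotes` and its `replacement` parameter are made up for this example.

```python
import re

# Python's re needs fixed-width lookbehind, so (?<!^|,) is split into
# two assertions: (?<!^) and (?<!,). MULTILINE makes ^ and $ work per line.
INNER_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,|$)', re.MULTILINE)

def strip_inner_quotes(text, replacement=""):
    """Replace quotes that do not sit next to a comma or a line boundary.

    `replacement` is a hypothetical knob: "" deletes the stray quotes,
    '""' would turn them into CSV-style escaped quotes instead.
    """
    return INNER_QUOTE.sub(replacement, text)

if __name__ == "__main__":
    sample = '"asdf","as "something" df","asdf","asdf"'
    print(strip_inner_quotes(sample))
```

Passing `replacement='""'` instead of the default doubles the stray quotes rather than deleting them, which is the usual CSV way of escaping a quote inside a quoted field.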
If you need to allow whitespace around the commas or at start/end-of-line, and if your regex flavor (which you didn't specify) allows arbitrary-length lookbehind (.NET does, for example), you can use
(?<!^\s*|,\s*)"(?!\s*,|\s*$)
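If your flavor does not allow arbitrary-length lookbehind (Python's built-in `re` does not), one possible workaround is to match every double quote and decide in a callback whether it is a delimiter. The sketch below assumes the input is processed one line at a time with the line ending already removed; `strip_inner_quotes_ws` is a hypothetical name, and this is a substitute technique, not the regex above.

```python
import re

def strip_inner_quotes_ws(line, replacement=""):
    """Remove quotes that are not field delimiters, tolerating whitespace.

    Assumes `line` is a single record without its trailing newline.
    """
    def fix(match):
        before = line[:match.start()]
        after = line[match.end():]
        # Opening delimiter: only whitespace between it and a comma or line start.
        opens = re.search(r'(?:^|,)\s*$', before) is not None
        # Closing delimiter: only whitespace between it and a comma or line end.
        closes = re.match(r'\s*(?:,|$)', after) is not None
        return match.group(0) if (opens or closes) else replacement

    return re.sub(r'"', fix, line)

if __name__ == "__main__":
    print(strip_inner_quotes_ws('"asdf" , "as "something" df" ,"asdf"'))
```

This re-scans the text around each quote, so it is slower than a single regex, but for a weekly file that is unlikely to matter.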