
Is there a way to match double quotes inside two double quotes?

Tags: regex, ruby

I tried the following regex, but it matches all the double quotes:

(?>(?<=(")|))"(?(1)(?!"))

Here is a sample of the text:

"[\"my cars last night\",
\"Burger\",\"Decaf\" shirt\",
\"Mocha\",\"marshmallows\",
\"Coffee Mission\"]"

What I want to match is the double quote that sits between the other double quotes on line 2 of the sample.

0bserver07 asked Dec 25 '15 22:12


1 Answer

As a general rule, I would say: no.

Given a string:

\"Burger\" \"Decaf\" shirt\"

How do you decide which \" is superfluous (non-matching)? Is it the one after Burger, the one after Decaf, or the one after shirt? Or the one before any of these words? I believe the choice is arbitrary.
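To make the ambiguity concrete, here is a small Ruby sketch (Ruby only because of the question's tag; the variable names are illustrative). It removes each \" in turn from the string above. Every one of the five results keeps four quotes, so the quote count alone cannot tell which one was the stray.

```ruby
# String from the answer; single quotes keep each backslash literal.
s = '\"Burger\" \"Decaf\" shirt\"'

# Collect the offset of every \" (backslash followed by double quote).
offsets = []
s.scan(/\\"/) { offsets << Regexp.last_match.begin(0) }

# Drop each \" in turn (two characters) and keep what remains.
candidates = offsets.map { |i| s[0...i] + s[(i + 2)..-1] }

# Print every candidate "repair" with the number of quotes left in it.
candidates.each { |c| puts "#{c}  (#{c.scan(/\\"/).size} quotes left)" }
```

Each printed line is an equally plausible repair, which is why a rule based on the surrounding characters is needed to pick one.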

In your particular example, though, it seems that you want every \" that is not adjacent to a comma or a square bracket and is not at the beginning of a line.

These can be found with the following regexp:

(?<!^)(?<![,\[])\\"(?![,\]])

We start with \\" (a backslash followed by a double quote) in the center.

Then we use a negative lookahead to discard all matches that are followed by a comma or a closing square bracket.

Then we use a negative lookbehind to discard all matches that come right after a comma or an opening square bracket.

The regexp engine that I used (Perl) can't cope with alternation inside a lookbehind when the alternatives differ in length, as ^ and [,\[] do. To work around it, I take advantage of the fact that lookarounds are zero-length assertions and can be chained: I prepend a separate negative lookbehind, (?<!^), at the beginning of the expression, which discards matches at the beginning of a line.
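Since the question is tagged ruby, here is a minimal sketch of the same pattern applied with String#gsub. It assumes the text contains a literal backslash before each quote, as in the sample; the names text, STRAY_QUOTE and fixed are chosen for illustration. In Ruby, ^ matches at the start of every line, so the whole multi-line string can be processed in one call.

```ruby
# Sample text from the question. The quoted heredoc delimiter ('TEXT')
# keeps every backslash literal.
text = <<'TEXT'
"[\"my cars last night\",
\"Burger\",\"Decaf\" shirt\",
\"Mocha\",\"marshmallows\",
\"Coffee Mission\"]"
TEXT

# Same pattern as above: a \" that is not at the start of a line,
# not preceded by , or [ and not followed by , or ].
STRAY_QUOTE = /(?<!^)(?<![,\[])\\"(?![,\]])/

# Replace each stray \" with a visible marker, as the Perl one-liner does.
fixed = text.gsub(STRAY_QUOTE, '|||')
puts fixed
```

To delete the stray quote instead of marking it, pass an empty string as the replacement.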

Proof (in Perl):

$ cat test
"[\"my cars last night\",
\"Burger\",\"Decaf\" shirt\",
\"Mocha\",\"marshmallows\",
\"Coffee Mission\"]"
$ perl -n -e '$_ =~ s/(?<!^)(?<![,\[])\\"(?![,\]])/|||/g; print $_' test
"[\"my cars last night\",
\"Burger\",\"Decaf||| shirt\",
\"Mocha\",\"marshmallows\",
\"Coffee Mission\"]"
Mirek Długosz answered Sep 19 '22 23:09