
What's the regex to match anything except a double quote not preceded by a backslash?

In other words, I have a string like:

"anything, escaped double-quotes: \", yep" anything here NOT to be matched.

How do I match everything inside the quotes?

I'm thinking of something like:

^"((?<!\\)[^"]+)"

But my head spins. Should that be a positive or a negative lookbehind? Does it work at all?

How do I match any characters except a double-quote NOT preceded by a backslash?

Core Xii asked Aug 29 '09 18:08


2 Answers

No lookbehind necessary:

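"(\\.|[^"\\])*"

So: match quotes, and inside them: either an escaped character (\\., which covers \") or any character except a quote or a backslash ([^"\\]), arbitrarily many times (*). The backslash has to be kept out of the character class; otherwise [^"] would consume the backslash on its own and the match would end at the escaped quote.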
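As a quick check of how the order and contents of the alternatives matter, here is a minimal Python sketch. The sample string is the one from the question; the variable names are made up for illustration. It compares a pattern that tries the character class [^"] first with one that tries the escape pair first and keeps the backslash out of the class.

```python
import re

# Sample string from the question, written as a raw string so the
# backslash reaches the regex engine unchanged.
text = r'"anything, escaped double-quotes: \", yep" anything here NOT to be matched.'

# Class first: [^"] also matches a backslash, so the backslash is eaten
# alone and the escaped quote is then taken as the closing quote.
class_first = re.compile(r'"([^"]|\\")*"')

# Escape pair first, with the backslash excluded from the class, so a
# backslash can only be consumed together with the character after it.
escape_first = re.compile(r'"(?:\\.|[^"\\])*"')

short = class_first.match(text).group(0)
full = escape_first.match(text).group(0)

print(short)
print(full)
```

With Python's `re` module, the first pattern should stop at the escaped quote, while the second should run to the real closing quote.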

Konrad Rudolph answered Sep 17 '22 19:09


"Not preceded by" translates directly to "negative lookbehind", so you'd want (?<!\\)".

Though here's a question that may ruin your day: what about the string "foo\\"? That is, a double quote preceded by two backslashes. In most escaping syntaxes the first backslash escapes the second, so the quote itself is not escaped and should end the string, yet the lookbehind sees a backslash right before it and rejects it.
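The lookbehind and its blind spot can be checked with a short Python sketch. The two sample strings and the variable names are assumptions chosen for illustration, not part of the original question.

```python
import re

# A double quote that is not immediately preceded by a backslash.
unescaped_quote = re.compile(r'(?<!\\)"')

# Escaped quotes inside the string: only the outer quotes are found.
ok = r'"say \"hi\"" tail'
positions_ok = [m.start() for m in unescaped_quote.finditer(ok)]

# Two backslashes before the closing quote: the quote is really
# unescaped, but the lookbehind only sees "a backslash comes first".
tricky = r'"foo\\" tail'
positions_tricky = [m.start() for m in unescaped_quote.finditer(tricky)]

print(positions_ok)      # [0, 11]
print(positions_tricky)  # [0]
```

In the second case the closing quote at index 6 is skipped, which is exactly the problem described above.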

That sort of thing is kind of why regexes aren't a substitute for parsers.
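For comparison, a hand-written scanner handles the double-backslash case without any lookaround. This is only a sketch under stated assumptions: `read_quoted` is a hypothetical helper name, and it assumes the simplest escaping rule, where a backslash makes the next character literal, whatever it is.

```python
from typing import Optional


def read_quoted(text: str, start: int = 0) -> Optional[str]:
    """Return the unescaped contents of the quoted string that begins
    at text[start], or None if there is no well-formed quoted string."""
    if start >= len(text) or text[start] != '"':
        return None
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == '\\':
            if i + 1 >= len(text):
                return None  # dangling backslash at end of input
            out.append(text[i + 1])  # assumed rule: next char is literal
            i += 2
        elif ch == '"':
            return ''.join(out)  # unescaped quote closes the string
        else:
            out.append(ch)
            i += 1
    return None  # ran out of input before a closing quote


print(read_quoted(r'"foo\\" tail'))
print(read_quoted(r'"a \"b\" c" rest'))
```

Because the scanner consumes a backslash together with the character after it, an even number of backslashes before a quote leaves that quote unescaped, and an odd number escapes it.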

chaos answered Sep 20 '22 19:09