
Detect Two Consecutive Single Quotes Inside Single Quotes

I'm struggling to get this regex pattern exactly right, and am open to other options outside of regex if someone has a better alternative.

The situation: I'm basically looking to parse a T-SQL "in" clause against a text column in C#. So, I need to take a string value like this: "'don''t', 'do', 'anything', 'stupid'"

And interpret that as a list of values (I'll take care of the double single quotes later):

  • "don''t"
  • "do"
  • "anything"
  • "stupid"

I have a regex that works for most cases, but I'm struggling to generalize it to the point where it will accept any character OR a doubled-up single quote inside my group: (?:')([a-z0-9\s(?:'(?='))]+)(?:')[,\w]*

I'm fairly experienced with regexes, but have rarely, if ever, found a need for look-arounds (so downgrade my assessment of my regex experience accordingly).

So, to put this another way, I want to take a string of comma-delimited values, each enclosed in single quotes and possibly containing doubled single quotes, and output each such value.

EDIT: Here's an example that does not work with my current regex (my problem is that I need to handle all characters in my grouping and stop when I encounter a single quote that is not followed by a second single quote):

"'don''t', 'do?', 'anything!', '#stupid$'"

Sven Grosen asked May 18 '15 15:05



1 Answer

If you are still considering a regex-based solution, you can use the following regex:

'(?:''|[^'])*'

Or an "un-rolled" version suggested by @sln:

'[^']*(?:''[^']*)*'


It is fairly simple: it matches either a doubled single quotation mark or any character that is not a single quotation mark. There is no need for lookbehinds or lookaheads. It does not handle any other escaped entities, but I do not see that requirement in your question.
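As a quick check against the harder input from the question's edit, here is a minimal sketch that runs both patterns over that string. The class name `QuotedValueDemo` and the helper `MatchAll` are made up for illustration; only the two patterns and the input come from this thread.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative, not from the original post.
public static class QuotedValueDemo
{
    // Alternation form: a doubled quote OR any non-quote character, repeated.
    public const string Alternation = @"'(?:''|[^'])*'";

    // Un-rolled form: runs of non-quotes, separated by doubled quotes.
    public const string Unrolled = @"'[^']*(?:''[^']*)*'";

    // Returns every full match (including the enclosing single quotes).
    public static string[] MatchAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // The input from the question's EDIT, with punctuation inside the values.
        const string input = "'don''t', 'do?', 'anything!', '#stupid$'";

        foreach (var pattern in new[] { Alternation, Unrolled })
        {
            // Both lines print: 'don''t' | 'do?' | 'anything!' | '#stupid$'
            Console.WriteLine(string.Join(" | ", MatchAll(pattern, input)));
        }
    }
}
```

Both forms should find the same four values here; the un-rolled one mainly avoids trying the alternation at every character.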

Moreover, this regex will return matches that are easy to access and deal with:

using System.Linq;
using System.Text.RegularExpressions;

var text = "'don''t', 'do', 'anything', 'stupid'";
var re = new Regex(@"'[^']*(?:''[^']*)*'"); // Updated thanks to @sln, previous (@"'(?:''|[^'])*'");
var match_values = re.Matches(text).Cast<Match>().Select(p => p.Value).ToList();

Output (each match includes its enclosing single quotes):

  • 'don''t'
  • 'do'
  • 'anything'
  • 'stupid'
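The question mentions dealing with the doubled single quotes later. One possible way to do both steps at once, sketched under the assumption that a doubled quote always stands for one literal quote (as in T-SQL string literals), is to put a capture group around the inner text and collapse `''` to `'` afterwards. `InListParser`, `Parse` and the `unescape` flag are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class InListParser
{
    // The un-rolled pattern with a capture group around the text between the quotes.
    private static readonly Regex QuotedValue =
        new Regex(@"'([^']*(?:''[^']*)*)'");

    // Returns the values without their enclosing quotes.
    // When unescape is true, each doubled quote is collapsed to a single one.
    public static List<string> Parse(string inClause, bool unescape = true) =>
        QuotedValue.Matches(inClause)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .Select(v => unescape ? v.Replace("''", "'") : v)
            .ToList();

    public static void Main()
    {
        // Prints don't, do, anything, stupid (one value per line).
        foreach (var value in Parse("'don''t', 'do', 'anything', 'stupid'"))
        {
            Console.WriteLine(value);
        }
    }
}
```

Passing `unescape: false` keeps the values as they appear in the question's list, for example `don''t`.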

Wiktor Stribiżew answered Sep 29 '22 15:09