I'm trying to write a regex that will match everything BUT an apostrophe that has not been escaped. Consider the following:
<?php $s = 'Hi everyone, we\'re ready now.'; ?>
My goal is to write a regular expression that will essentially match the string portion of that. I'm thinking of something such as
/.*'([^']).*/
in order to match a simple string, but I've been trying to figure out how to get a negative lookbehind working on that apostrophe to ensure that it is not preceded by a backslash...
Any ideas?
- JMT
Here's my solution with test cases:
/.*?'((?:\\\\|\\'|[^'])*+)'/
And my proof (in Perl, though I don't think I use any Perl-specific features):
use strict;
use warnings;
my %tests = ();
$tests{'Case 1'} = <<'EOF';
$var = 'My string';
EOF
$tests{'Case 2'} = <<'EOF';
$var = 'My string has it\'s challenges';
EOF
$tests{'Case 3'} = <<'EOF';
$var = 'My string ends with a backslash\\';
EOF
foreach my $key (sort (keys %tests)) {
    print "$key...\n";
    if ($tests{$key} =~ m/.*?'((?:\\\\|\\'|[^'])*+)'/) {
        print " ... '$1'\n";
    } else {
        print " ... NO MATCH\n";
    }
}
Running this shows:
$ perl a.pl
Case 1...
 ... 'My string'
Case 2...
 ... 'My string has it\'s challenges'
Case 3...
 ... 'My string ends with a backslash\\'
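Note that the initial wildcard at the start needs to be non-greedy; otherwise it would run ahead to the last quote on the line. Inside the capture group I use a possessive (non-backtracking) quantifier, *+, to gobble up \\ and \' as whole units, and then anything else that is not a quote character. The order of the alternatives matters: \\ is tried first, so in Case 3 the trailing pair of backslashes is consumed together and the quote after it is correctly seen as the closing one.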
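To show why consuming escapes as units matters, here is a small Python sketch (not the Perl above) that compares an escape-aware pattern with the negative-lookbehind idea from the question. The names ESCAPE_AWARE, LOOKBEHIND and extract are made up for this illustration. Since Python's re module only supports possessive quantifiers from version 3.11 on, I am assuming a slightly different alternation here, one whose branches cannot overlap, so ordinary backtracking does no harm.

```python
import re

# Hypothetical names for illustration; this is a Python variant of the idea,
# not the original Perl pattern.
# Escape-aware: a backslash always consumes the character that follows it.
ESCAPE_AWARE = re.compile(r"'((?:\\.|[^'\\])*)'")
# Lookbehind only: the closing quote just must not follow a backslash.
LOOKBEHIND = re.compile(r"'(.*?)(?<!\\)'")

samples = [
    r"$var = 'My string';",
    r"$var = 'My string has it\'s challenges';",
    r"$var = 'My string ends with a backslash\\';",
]

def extract(pattern, text):
    """Return the captured string body, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

for s in samples:
    print(extract(ESCAPE_AWARE, s), "|", extract(LOOKBEHIND, s))
```

Both patterns agree on the first two samples. On the third, the lookbehind version finds no match, because the closing quote is preceded by a backslash that is itself escaped, which a single-character lookbehind cannot tell apart from an escaped quote.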
I think this probably mimics how the language's own parser scans a string literal, consuming escape sequences from left to right, which should make it fairly robust.
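For comparison, this is roughly what I imagine a hand-written scanner doing. It is a minimal Python sketch under my own assumptions, not PHP's actual lexer, and read_single_quoted is a hypothetical helper name. It walks the text once, skips whatever character follows a backslash, and stops at the first quote that was not skipped.

```python
def read_single_quoted(text, start=0):
    """Return the raw body of the first single-quoted literal found at or
    after `start`, or None if there is no quote or it is never closed.
    (Hypothetical helper, for illustration only.)"""
    i = text.find("'", start)
    if i == -1:
        return None
    i += 1                      # step past the opening quote
    body_start = i
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2              # skip the backslash and the character it escapes
        elif ch == "'":
            return text[body_start:i]   # unescaped quote closes the literal
        else:
            i += 1
    return None                 # ran off the end without a closing quote

print(read_single_quoted(r"$var = 'My string ends with a backslash\\';"))
```

Like the regex, this never re-examines a character once it has been consumed as part of an escape, which is the property that keeps \\' from being misread.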