I have the regex
(\p{P})\1
which successfully matches duplicate consecutive punctuation characters like
;;
,,
\\
but I need to exclude three consecutive periods (an ellipsis):
...
Be careful: some approaches fail to match strings of the form .##
(i.e. a '.' immediately before repeated punctuation), assuming that is something that should match.
This solution meets the requirements, including matching input such as:
.##
This is the regex:
(?>(\p{P})\1+)(?<!([^.]|^)\.{3})
Explanation:
?>
means atomic grouping: once the group has matched, all backtracking positions inside it are thrown away. If '...' fails the rest of the pattern, the engine does not step back and try to match '..'.
(\p{P})\1+
means match the same punctuation character 2 or more times. This is your original pattern with + added.
(?<!([^.]|^)\.{3})
means search backwards from the end of the repeated-character match and fail if you find three dots that are preceded by a non-dot character or by the beginning of the string. This rejects a run of exactly three dots while still allowing two dots, or four dots or more, to match.
The following test cases pass and illustrate use:
string pattern = @"(?>(\p{P})\1+)(?<!([^.]|^)\.{3})";
//Your examples:
Assert.IsTrue( Regex.IsMatch( @";;", pattern ) );
Assert.IsTrue( Regex.IsMatch( @",,", pattern ) );
Assert.IsTrue( Regex.IsMatch( @"\\", pattern ) );
//two and four dots should match
Assert.IsTrue( Regex.IsMatch( @"..", pattern ) );
Assert.IsTrue( Regex.IsMatch( @"....", pattern ) );
//Some success variations
Assert.IsTrue( Regex.IsMatch( @".;;", pattern ) );
Assert.IsTrue( Regex.IsMatch( @";;.", pattern ) );
Assert.IsTrue( Regex.IsMatch( @";;///", pattern ) );
Assert.IsTrue( Regex.IsMatch( @";;;...//", pattern ) ); //With Regex.Matches, the matches are ;;; and // but not ...
Assert.IsTrue( Regex.IsMatch( @"...;;;//", pattern ) ); //With Regex.Matches, the matches are ;;; and // but not ...
//Three dots should not match
Assert.IsFalse( Regex.IsMatch( @"...", pattern ) );
Assert.IsFalse( Regex.IsMatch( @"a...", pattern ) );
Assert.IsFalse( Regex.IsMatch( @";...;", pattern ) );
//Other tests
Assert.IsFalse( Regex.IsMatch( @".", pattern ) );
Assert.IsFalse( Regex.IsMatch( @";,;,;,;,", pattern ) ); //single punctuation does not match
Assert.IsTrue( Regex.IsMatch( @".;;.", pattern ) );
Assert.IsTrue( Regex.IsMatch( @"......", pattern ) );
Assert.IsTrue( Regex.IsMatch( @"a....a", pattern ) );
Assert.IsFalse( Regex.IsMatch( @"abcde", pattern ) );
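The comments about Regex.Matches and the role of the atomic group can be checked with a short program. This is only a minimal sketch, assuming .NET 6 or later (it uses top-level statements); FindRuns and NonAtomicPattern are names introduced here for illustration and are not part of the answer itself.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Pattern from the answer above.
const string Pattern = @"(?>(\p{P})\1+)(?<!([^.]|^)\.{3})";

// Same pattern without the atomic group, for comparison only (hypothetical variant).
const string NonAtomicPattern = @"(\p{P})\1+(?<!([^.]|^)\.{3})";

// Returns the text of every match of `pattern` in `input`, in order.
string[] FindRuns(string input, string pattern) =>
    Regex.Matches(input, pattern)
         .Cast<Match>()
         .Select(m => m.Value)
         .ToArray();

// The ellipsis in the middle is skipped; the runs on either side are reported.
Console.WriteLine(string.Join(" ", FindRuns(";;;...//", Pattern)));   // ;;; //
Console.WriteLine(string.Join(" ", FindRuns("...;;;//", Pattern)));   // ;;; //

// Four dots are a single match.
Console.WriteLine(string.Join(" ", FindRuns("a....a", Pattern)));     // ....

// Exactly three dots: no match with the atomic group...
Console.WriteLine(FindRuns("...", Pattern).Length);                   // 0

// ...but without it the engine backtracks and accepts the first two dots.
Console.WriteLine(string.Join(" ", FindRuns("...", NonAtomicPattern))); // ..
```

The last line shows why the atomic group matters: without it, \1+ gives back one dot, the lookbehind no longer sees three dots, and '..' is accepted.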
To avoid matching ... you can use
(?<![.])(?![.]{3})(\p{P})\1
or, so that longer runs of dots such as .... still match,
(?<!\.)(?!\.{3}(?!\.))(\p{P})\1+
The second pattern matches any repeated punctuation (including .... or ...... etc.) unless the run is exactly three dots (...).
For example:
; -- No Match
;; -- Match
,, -- Match
,,,, -- Match
\\ -- Match
... -- No Match
.... -- Match
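The table above can be reproduced with a few lines of C#. This is a sketch under the same assumption of .NET 6 or later with top-level statements; LookaroundPattern and HasRun are illustrative names, not part of the answer. The extra .## sample is the case mentioned earlier in the thread: because of the leading (?<!\.), this pattern does not match a run that starts right after a dot.

```csharp
using System;
using System.Text.RegularExpressions;

// Second pattern from the answer above.
const string LookaroundPattern = @"(?<!\.)(?!\.{3}(?!\.))(\p{P})\1+";

// True if `input` contains at least one match of the pattern.
bool HasRun(string input) => Regex.IsMatch(input, LookaroundPattern);

// @"\\" is a verbatim string containing two backslash characters.
var samples = new[] { ";", ";;", ",,", ",,,,", @"\\", "...", "....", ".##" };

foreach (var sample in samples)
{
    Console.WriteLine($"{sample} -- {(HasRun(sample) ? "Match" : "No Match")}");
}
// Prints the same results as the table, plus:
// .## -- No Match
```

If strings like .## should match, the atomic-group pattern from the other answer handles them.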