The following regular expression matches "Saturday" or "Sunday": (?:(Sat)ur|(Sun))day
For "Saturday", backreference 1 is filled and backreference 2 is empty; for "Sunday", it is the other way around.
PHP (PCRE) provides a nice operator, "?|", that circumvents this problem. The previous regex would become (?|(Sat)ur|(Sun))day, so there are no empty backreferences.
Is there an equivalent in C#, or some workaround?
As this manual page says, you need PHP 5.1.0 and the /u modifier in order to enable these features, but that isn't the only requirement! It is possible to install later versions of PHP (we have 5.1.4) while linking to an older PCRE install.
In Perl, PCRE, and Boost, it is best to use a branch reset group when you want groups in different alternatives to have the same name. That’s the only way in Perl, PCRE, and Boost to make sure that groups with the same name really are one and the same group.
Perl 5.10 introduced a new regular expression feature called a branch reset group. JGsoft V2 and PCRE 7.2 and later also support this, as do languages like PHP, Delphi, and R that have regex functions based on PCRE. Boost added them to its ECMAScript grammar in version 1.42. Alternatives inside a branch reset group share the same capturing groups.
It should be possible to concatenate backreference 1 and backreference 2.
Exactly one of the two is always empty, and concatenating a string with an empty string leaves it unchanged, so the result is whichever abbreviation matched.
With your regex (?:(Sat)ur|(Sun))day and the replacement $1$2, you get Sat for Saturday and Sun for Sunday.
regex (?:(Sat)ur|(Sun))day

input    | backref1 _$1_ | backref2 _$2_ | 'concat' _$1$2_
---------|---------------|---------------|----------------
Saturday | 'Sat'         | ''            | 'Sat'+'' = Sat
Sunday   | ''            | 'Sun'         | ''+'Sun' = Sun
Instead of reading either backref1 or backref2, read both and concatenate them.
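A minimal C# sketch of this concatenation idea, assuming the pattern from the question. The class name ConcatGroupsDemo and the helper Abbreviate are made up for illustration; only Regex, Match, and Group come from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConcatGroupsDemo
{
    // Pattern from the question: group 1 captures "Sat", group 2 captures "Sun".
    private static readonly Regex DayPattern = new Regex(@"(?:(Sat)ur|(Sun))day");

    // Hypothetical helper: returns the abbreviation, or "" when nothing matches.
    public static string Abbreviate(string input)
    {
        Match match = DayPattern.Match(input);
        if (!match.Success)
        {
            return string.Empty;
        }
        // A group that did not take part in the match has an empty Value,
        // so concatenating both groups yields whichever one matched.
        return match.Groups[1].Value + match.Groups[2].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Abbreviate("Saturday")); // group 1 took part
        Console.WriteLine(Abbreviate("Sunday"));   // group 2 took part
        // The same idea written as a replacement string.
        Console.WriteLine(DayPattern.Replace("Saturday and Sunday", "$1$2"));
    }
}
```

This relies on .NET reporting a non-participating group as an empty string rather than null, which is why no null check is needed before concatenating.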
.NET doesn't support the branch-reset operator, but it does support named groups, and it lets you reuse group names without restriction (something no other flavor does, AFAIK). So you could use this:
(?:(?<abbr>Sat)ur|(?<abbr>Sun))day
...and the abbreviated name will be stored in Match.Groups["abbr"].
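As a rough sketch of how the duplicate-name approach might look in a full program (the class name NamedGroupDemo and the helper Abbreviate are invented for this example; the pattern is the one given above):

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Both alternatives capture into the same named group "abbr".
    private static readonly Regex DayPattern =
        new Regex(@"(?:(?<abbr>Sat)ur|(?<abbr>Sun))day");

    // Hypothetical helper: returns the captured abbreviation, or "" when nothing matches.
    public static string Abbreviate(string input)
    {
        Match match = DayPattern.Match(input);
        // Whichever alternative matched, its capture is read under the one shared name.
        return match.Success ? match.Groups["abbr"].Value : string.Empty;
    }

    public static void Main()
    {
        foreach (string day in new[] { "Saturday", "Sunday" })
        {
            Console.WriteLine(day + " -> " + Abbreviate(day));
        }
    }
}
```

Compared with concatenating two numbered groups, this keeps the calling code unaware of how many alternatives the pattern has.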