Is it possible to skip a couple of characters in a capture group in regular expressions? I am using .NET regexes but that shouldn't matter.
Basically, what I am looking for is:
[random text]AB-123[random text]
and I need to capture 'AB123', without the hyphen.
I know that AB is 2 or 3 uppercase characters and 123 is 2 or 3 digits, but that's not the hard part. The hard part (at least for me) is skipping the hyphen.
I guess I could capture both separately and then concatenate them in code, but I wish I had a more elegant, regex-only solution.
Any suggestions?
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g".
Parentheses group the regex between them. They capture the text matched by the regex inside them into a numbered group that can be reused with a numbered backreference. They allow you to apply regex operators to the entire grouped regex. (abc){3} matches abcabcabc.
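To make the grouping behaviour described above concrete, here is a minimal C# sketch. The class name GroupDemo and the helper FirstGroup are hypothetical names chosen for illustration; only Regex.Match, Regex.IsMatch and Match.Groups come from .NET itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns the text captured by group 1,
    // or null when the pattern does not match at all.
    public static string FirstGroup(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // (dog) captures the three letters as a single numbered group.
        Console.WriteLine(FirstGroup("hotdog", "(dog)"));

        // The quantifier {3} applies to the whole group, not just to "c".
        Console.WriteLine(Regex.IsMatch("abcabcabc", "^(abc){3}$"));

        // \1 is a numbered backreference to whatever group 1 captured.
        Console.WriteLine(Regex.IsMatch("abcabc", @"^(abc)\1$"));
    }
}
```

Group 0 always holds the whole match, so numbered capture groups start at index 1.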
In order to use a literal ^ at the start or a literal $ at the end of a regex, the character must be escaped. Some flavors only use ^ and $ as metacharacters when they are at the start or end of the regex respectively. In those flavors, no additional escaping is necessary. It's usually just best to escape them anyway.
In short: You can't. A match is always consecutive. Even when the pattern contains zero-width assertions, there is no way around matching the next character if you want to get to the one after it.
There really isn't a way to create an expression such that the matched text is different from what is found in the source text. You will need to remove the hyphen in a separate step, either by matching the first and second parts individually and concatenating the two groups:
// [A-Z] covers all uppercase letters; group 1 is the letters, group 2 the digits.
Match match = Regex.Match(text, "([A-Z]{2,3})-([0-9]{2,3})");
string matchedText = string.Format("{0}{1}", match.Groups[1].Value, match.Groups[2].Value);
Or by removing the hyphen in a step separate from the matching process:
// Match the whole token, hyphen included, then strip the hyphen from the result.
Match match = Regex.Match(text, "[A-Z]{2,3}-[0-9]{2,3}");
string matchedText = match.Value.Replace("-", "");
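The two approaches above can be put together into a runnable sketch. The class name HyphenDemo, its method names and the sample input are assumptions made for illustration. The third method is a variation not mentioned in the answer: it uses Match.Result with the replacement pattern "$1$2", which still matches the hyphen and only leaves it out when the result string is built, so it is concatenation in another form rather than a way to skip characters inside a capture.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenDemo
{
    // 2-3 uppercase letters, a hyphen, then 2-3 digits, as described in the question.
    private const string Pattern = "([A-Z]{2,3})-([0-9]{2,3})";

    // Approach 1: capture both parts separately and concatenate them.
    public static string Concatenate(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return m.Success ? m.Groups[1].Value + m.Groups[2].Value : null;
    }

    // Approach 2: take the whole match and remove the hyphen afterwards.
    public static string StripHyphen(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return m.Success ? m.Value.Replace("-", "") : null;
    }

    // Variation: let the replacement pattern "$1$2" join the two groups.
    // Match.Result must only be called on a successful match.
    public static string ViaResult(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return m.Success ? m.Result("$1$2") : null;
    }

    public static void Main()
    {
        string text = "see ticket AB-123 for details"; // sample input, assumed
        Console.WriteLine(Concatenate(text));
        Console.WriteLine(StripHyphen(text));
        Console.WriteLine(ViaResult(text));
    }
}
```

All three methods return null when the input contains no matching token, so callers should check for that before using the result.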