I want to check whether a certain pattern (e.g. a double-quoted string) matches at an exact position.
Example
    string text = "aaabbb";
    Regex regex = new Regex("b+");
    // Now match regex at exactly char 3 (offset) of text
I'd like to check whether regex matches at exactly char 3.
I had a look at the Regex.Match Method (String, Int32), but it does not behave as I expected.
So I did some tests and some workarounds:
    public void RegexTest2()
    {
        Match m;
        string text = "aaabbb";
        int offset = 3;

        m = new Regex("^a+").Match(text, 0); // let's do a sanity check first
        Assert.AreEqual(true, m.Success);
        Assert.AreEqual("aaa", m.Value); // works as expected

        m = new Regex("^b+").Match(text, offset);
        Assert.AreEqual(false, m.Success); // this is quite strange...

        m = new Regex("^.{" + offset + "}(b+)").Match(text); // works, but is not very 'nice'
        Assert.AreEqual(true, m.Success);
        Assert.AreEqual("bbb", m.Groups[1].Value);

        m = new Regex("^b+").Match(text.Substring(offset)); // works too, but
        Assert.AreEqual(true, m.Success);
        Assert.AreEqual("bbb", m.Value);
    }
In fact, I'm starting to believe that new Regex("^.").Match(myString, 1) will never match anything.
Any suggestions?
Edit:
I got a working solution (workaround). So my question is all about speed and a nice implementation.
Have you tried what the docs say?
If you want to restrict a match so that it begins at a particular character position in the string and the regular expression engine does not scan the remainder of the string for a match, anchor the regular expression with a \G (at the left for a left-to-right pattern, or at the right for a right-to-left pattern). This restricts the match so it must start exactly at startat.
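To make the quoted behaviour concrete, here is a minimal sketch comparing an unanchored pattern, a ^ pattern and a \G pattern when a start position is passed to Match. The class name StartAtDemo and the helper MatchIndex are made up for this illustration; only Regex.Match(String, Int32) comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustration only: StartAtDemo and MatchIndex are hypothetical names.
public static class StartAtDemo
{
    // Returns the index where the pattern matched, or -1 if there was no match.
    public static int MatchIndex(string pattern, string text, int startAt)
    {
        Match m = new Regex(pattern).Match(text, startAt);
        return m.Success ? m.Index : -1;
    }
}
```

With text = "aaabbb", MatchIndex("b+", text, 1) returns 3 because the engine scans forward from index 1, MatchIndex(@"\Gb+", text, 1) returns -1 because \G forbids that scan, MatchIndex(@"\Gb+", text, 3) returns 3, and MatchIndex("^b+", text, 3) returns -1 because ^ still refers to the start of the whole string.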
i.e. replace the ^ with a \G:
    m = new Regex(@"\Gb+").Match(text, offset);
    Assert.AreEqual(true, m.Success); // should now work
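Since the question also asks for a nice implementation, one possible way to package this is a small helper that prepends \G to an arbitrary pattern. This is only a sketch: RegexAt, Anchored and TryMatchAt are hypothetical names, not part of the framework. The (?:...) group is there so that \G applies to the whole pattern and not just to the first alternative of something like a|b+.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustration only: RegexAt, Anchored and TryMatchAt are hypothetical names.
public static class RegexAt
{
    // Wraps the pattern so that it can only match at the position passed to Match.
    public static Regex Anchored(string pattern)
    {
        return new Regex(@"\G(?:" + pattern + ")");
    }

    // Returns true and the matched text if the anchored regex matches exactly at offset.
    public static bool TryMatchAt(Regex anchored, string text, int offset, out string value)
    {
        Match m = anchored.Match(text, offset);
        value = m.Success ? m.Value : string.Empty;
        return m.Success;
    }
}
```

Build the Regex once with Anchored("b+") and reuse it for every position you want to test; compared with the Substring workaround this avoids copying the tail of the string for each check.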