I have a C# function that finds patterns of text inside an input and does some processing. (I am using version 3.5 of the .NET Framework.)
public void func(string s)
{
    // Verbatim string (@"...") so that \s reaches the regex engine unescaped
    Regex r = new Regex(@"^\s*Pattern\s*$", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
    Match m = r.Match(s);
    // Do something with m
}
A use of the function might look like this:
string s = "Pattern \n Pattern \n non-Pattern";
func(s);
However, I am finding that sometimes my input looks more like this:
string s = "Pattern \r Pattern \r non-Pattern";
func(s);
And it is not being matched. Is there a way to have \r be treated like a \n within the regex? I figure I could always just replace all \r characters with \n, but I was hoping to minimize operations by getting the regex to do it all at once.
Historically, \n (line feed) was used to move the carriage down one line, while \r (carriage return) was used to move the carriage back to the left side of the page.
Unfortunately, when I have run into similar situations, the only solution I found that works is to do two passes with the regex (as you were hoping to avoid): the first pass normalizes the line endings, and the second can then do the search as normal. I could not find any way to get Multiline to trigger on just \r.
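public void func(string s)
{
    // First pass: normalize every line-ending variant to \r\n
    s = Regex.Replace(s, @"(\r\n|\n\r|\n|\r)", "\r\n");
    // Second pass: search as normal; Multiline anchors now see a \n at every line break
    Regex r = new Regex(@"^\s*Pattern\s*$", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
    Match m = r.Match(s);
    // Do something with m
}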
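If the cost of a second regex pass is the concern, the normalization step could probably be done with plain string replacement instead. The following is only a sketch of that variant: the class and method names are made up for illustration, and unlike the regex above it does not treat \n\r as a single line break (it would become two \n characters).

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class NormalizedSearch
{
    // Same pattern as in the question, compiled once and reused.
    private static readonly Regex PatternLine = new Regex(
        @"^\s*Pattern\s*$",
        RegexOptions.Multiline | RegexOptions.ExplicitCapture);

    // Converts \r\n and lone \r to \n.
    // Order matters: \r\n is collapsed first so the second call only sees lone \r.
    public static string NormalizeLineEndings(string s)
    {
        return s.Replace("\r\n", "\n").Replace("\r", "\n");
    }

    // Counts the lines that consist only of "Pattern" plus optional whitespace.
    public static int CountPatternLines(string s)
    {
        return PatternLine.Matches(NormalizeLineEndings(s)).Count;
    }
}
```

With the \r-separated input from the question, CountPatternLines should find the two Pattern lines and skip non-Pattern.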
According to the documentation, Anchors in Regular Expressions:
^ in Multiline mode will match the beginning of the input string, or the start of a line (as defined by \n).
$ in Multiline mode will match the end of the input string, or just before \n.
If your purpose is to redefine the anchors so that a line is delimited by both \r and \n, then you have to simulate them with look-ahead and look-behind.
^ should be simulated with (?<=\A|[\r\n])
$ should be simulated with (?=\Z|[\r\n])
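As a rough sketch of how these two substitutions might be applied to the pattern from the question, the anchors are swapped for the look-arounds and RegexOptions.Multiline is dropped, since ^ and $ are no longer used. The class and method names here are hypothetical, for illustration only.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class SimulatedAnchorSearch
{
    // (?<=\A|[\r\n]) stands in for ^ and (?=\Z|[\r\n]) stands in for $,
    // so either \r or \n is accepted as a line boundary.
    private static readonly Regex PatternLine = new Regex(
        @"(?<=\A|[\r\n])\s*Pattern\s*(?=\Z|[\r\n])",
        RegexOptions.ExplicitCapture);

    // Counts the lines that consist only of "Pattern" plus optional whitespace.
    public static int CountPatternLines(string s)
    {
        return PatternLine.Matches(s).Count;
    }
}
```

With the \r-separated input from the question, this should report two matches in a single pass; non-Pattern is rejected because its P is not preceded by a line start.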
Note that the simulation above will consider \r\n to have 3 starts of line and 3 ends of line. 1 start of line and 1 end of line are defined by the start and end of the string. The other 2 starts of line and 2 ends of line are defined by \r and \n.
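One way to check this count is to tally the zero-width matches of each simulated anchor. The helper names below are made up for this sketch.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class SimulatedAnchorCount
{
    // Number of positions where the simulated ^ matches.
    public static int CountLineStarts(string s)
    {
        return Regex.Matches(s, @"(?<=\A|[\r\n])").Count;
    }

    // Number of positions where the simulated $ matches.
    public static int CountLineEnds(string s)
    {
        return Regex.Matches(s, @"(?=\Z|[\r\n])").Count;
    }
}
```

For the two-character string "\r\n", both methods should return 3, so a Windows-style line break is seen as two breaks with an empty line in between.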