Let's say I have the following text and I want to extract the text between "Start of numbers" and "End of numbers". The number of lines between the two markers is dynamic, and the only thing that changes within those lines is the number word (e.g. first, second, etc.). Each file I'll be extracting data from has a different number of lines between "Start of numbers" and "End of numbers". How can I write a regex to match the content between the two markers without knowing how many lines will be between them?
Regards!
This is the first line
This is the second line
Start of numbers
This is the first line
This is the second line
This is the third line
This is the ...... line
This is the ninth line
End of numbers
The Multiline option, or the m inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ language elements so that they match the beginning and end of each line, instead of only the beginning and end of the input string.
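As a rough illustration of how this mode changes ^, here is a small Python sketch. It assumes Python's re.MULTILINE flag as the counterpart of the Multiline option, and the sample string is a shortened version of the text in the question.

```python
import re

text = "Start of numbers\nThis is the first line\nEnd of numbers"

# Without MULTILINE, ^ only matches at the very start of the input.
default_hits = re.findall(r"^\w+", text)

# With MULTILINE, ^ also matches right after each newline.
multiline_hits = re.findall(r"^\w+", text, flags=re.MULTILINE)

print(default_hits)
print(multiline_hits)
```

Note that this mode only affects the anchors. On its own it does not let . cross a line break, so it is not enough to capture a block that spans several lines.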
If you want to indicate a line break when you construct your regex, use the sequence "\r\n" for Windows-style line endings or "\n" for Unix-style ones; "\r?\n" accepts both. Whether or not you will have line breaks in your expression depends on what you are trying to match. Line breaks can be useful "anchors" that define where some pattern occurs in relation to the beginning or end of a line.
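A minimal Python sketch of using line breaks as anchors around the two markers. The pattern \r?\n is an assumption on my part that the files may use either Windows or Unix line endings; the sample strings are made up from the question's text.

```python
import re

windows_text = "Start of numbers\r\nThis is the first line\r\nEnd of numbers"
unix_text = "Start of numbers\nThis is the first line\nEnd of numbers"

# \r?\n accepts both Windows (\r\n) and Unix (\n) line endings.
# The lazy .*? keeps a trailing \r out of the captured group.
pattern = re.compile(r"Start of numbers\r?\n(.*?)\r?\nEnd of numbers")

windows_match = pattern.search(windows_text)
unix_match = pattern.search(unix_text)

print(windows_match.group(1))
print(unix_match.group(1))
```

This version only handles a single line between the markers, because . still stops at a newline here.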
By default in most regex engines, . doesn't match newline characters, so the matching stops at the end of each logical line. If you want . to match really everything, including newlines, you need to enable "dot-matches-all" mode in your regex engine of choice (for example, add the re.DOTALL flag in Python, or /s in PCRE).
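To make the dot-matches-all idea concrete, here is a hedged Python sketch that pulls out everything between the two markers regardless of how many lines there are. The sample text is abbreviated from the question, and the lazy quantifier .*? is my own choice so that the match stops at the first "End of numbers".

```python
import re

text = (
    "This is the first line\n"
    "Start of numbers\n"
    "This is the first line\n"
    "This is the second line\n"
    "This is the ninth line\n"
    "End of numbers\n"
)

# re.DOTALL lets . match newlines; .*? is lazy, so it stops at the first end marker.
pattern = re.compile(r"Start of numbers\n(.*?)\nEnd of numbers", re.DOTALL)

match = pattern.search(text)
lines = match.group(1).splitlines() if match else []
print(lines)
```

With a greedy .* instead, a file containing several "End of numbers" markers would be captured up to the last one, which may or may not be what you want.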
To match a character that has a special meaning in regex, you need to use an escape sequence: prefix the character with a backslash ( \ ). E.g., \. matches "." ; regex \+ matches "+" ; and regex \( matches "(" . You also need to use regex \\ to match "\" (backslash).
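If the start and end markers themselves contain special characters, escaping them by hand is error-prone. The sketch below uses Python's re.escape for that; the two markers are hypothetical and chosen only because they contain metacharacters.

```python
import re

# Hypothetical markers that happen to contain regex metacharacters.
start_marker = "Start (numbers)"
end_marker = "End +numbers."

# re.escape adds the backslashes so each marker is matched literally.
pattern = re.compile(
    re.escape(start_marker) + r"(.*?)" + re.escape(end_marker),
    re.DOTALL,
)

text = "Start (numbers)\none\ntwo\nEnd +numbers."
match = pattern.search(text)
body = match.group(1).strip() if match else ""
print(body)
```

Without the escaping, "(numbers)" would be read as a capture group rather than as literal parentheses, and the start marker would not be found in this text.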
You should use Singleline mode, which tells your C# regular expression that . matches any character (rather than any character except \n).
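// Singleline makes '.' match '\n' too; the lazy '.*?' stops at the first "End of numbers".
var regex = new Regex("Start of numbers(.*?)End of numbers",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);
string numbers = regex.Match(input).Groups[1].Value;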
You should be able to match multi-line strings without issue. Just remember to include the right characters in the pattern (\n for new lines).
string pattern = "Start of numbers(.|\n)*End of numbers";
// Regex.Match returns a single Match; Regex.Matches returns a MatchCollection.
Match m = Regex.Match(input, pattern);
This is easier if you can think of your string with the hidden characters.
Start of numbers\n\nThis is the first line\nThis is the second line\n ...