
Regular Expression Match variable multiple lines? [duplicate]

Tags:

c#

regex

Let's say I have the following text and I want to extract the text between "Start of numbers" and "End of numbers". The number of lines between the two markers varies from file to file, and the only thing that changes within those lines is the ordinal (first, second, etc.). How can I write a regex that matches the content between "Start of numbers" and "End of numbers" without knowing how many lines lie between them?

Regards!

This is the first line
This is the second line

Start of numbers

This is the first line
This is the second line
This is the third line
This is the ...... line
This is the ninth line

End of numbers
Arya asked Apr 24 '12 05:04


People also ask

What is multiline in regex?

The Multiline option (RegexOptions.Multiline in .NET, or the inline m option) lets the regular expression engine handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ anchors so that they match at the beginning and end of each line, instead of only at the beginning and end of the whole input string. It does not change what . matches.
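
As a rough illustration of the anchor behaviour described above, here is a minimal C# sketch. The class and method names (MultilineDemo, CountLines) are made up for this example; only Regex.Matches and RegexOptions.Multiline come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Counts the lines that begin with "This is the".
    // With RegexOptions.Multiline, ^ matches at the start of every line;
    // without it, ^ matches only at the very start of the input.
    public static int CountLines(string input, bool multiline)
    {
        RegexOptions options = multiline ? RegexOptions.Multiline : RegexOptions.None;
        return Regex.Matches(input, "^This is the", options).Count;
    }
}
```

For an input whose first and third lines both start with "This is the", CountLines(input, true) counts both lines, while CountLines(input, false) counts only the first.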

How do I enable line breaks in regex?

If you want to indicate a line break when you construct your regex, use “\n” for Unix-style line endings or “\r\n” for Windows-style line endings. Whether or not you need line breaks in your expression depends on what you are trying to match. Line breaks can be useful “anchors” that define where some pattern occurs in relation to the beginning or end of a line.

How do you match a character including newline in regex?

By default in most regex engines, . doesn't match newline characters, so matching stops at the end of each logical line. If you want . to match everything, including newlines, you need to enable “dot-matches-all” mode in your regex engine of choice (for example, the re.DOTALL flag in Python, /s in PCRE, or RegexOptions.Singleline in .NET).

How do you match in regex?

To match a character that has a special meaning in regex, escape it by prefixing it with a backslash ( \ ). E.g., \. matches "." ; \+ matches "+" ; and \( matches "(" . To match a literal backslash, use \\ .


2 Answers

You should use Singleline mode (RegexOptions.Singleline), which tells the C# regex engine that . matches any character, including \n. By default, . matches any character except \n.

var regex = new Regex("Start of numbers(.*)End of numbers",
                  RegexOptions.IgnoreCase | RegexOptions.Singleline);
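
A slightly fuller sketch of the same idea, in case it helps. The names NumberBlock and Extract are hypothetical, and the lazy quantifier .*? is a swapped-in variation on the greedy .* above: it stops at the first "End of numbers" rather than the last, which matters only if a file contains more than one block.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberBlock
{
    // Singleline lets '.' match '\n'; the lazy (.*?) stops at the
    // first "End of numbers" instead of running to the last one.
    private static readonly Regex BlockRegex = new Regex(
        "Start of numbers(.*?)End of numbers",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the text between the two markers with surrounding
    // whitespace trimmed, or null when the markers are not found.
    public static string Extract(string input)
    {
        Match m = BlockRegex.Match(input);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }
}
```

Calling NumberBlock.Extract(File.ReadAllText(path)) on each file (path being whatever file you are processing) would then give the block regardless of how many lines it contains.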
Paul Oliver answered Oct 23 '22 18:10


You should be able to match multi-line strings without issue. Just remember to include the right characters in the pattern (\n for newlines).

string pattern = "Start of numbers(.|\n)*End of numbers";
// Regex.Match returns a single Match; Regex.Matches returns a MatchCollection.
Match m = Regex.Match(input, pattern);

This is easier if you picture your string with the hidden characters written out:

Start of numbers\n\nThis is the first line\nThis is the second line\n ...
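
Building on that picture, here is a hedged sketch that captures the block and splits it back into individual lines. NumberLines and Extract are hypothetical names, the capture group and lazy quantifier are additions to the pattern above, and the sketch assumes line endings are either \n or \r\n (in .NET, . already matches \r, so the (.|\n) alternation covers both).

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberLines
{
    // Returns the non-empty lines found between the two markers,
    // or an empty array when the markers are not present.
    public static string[] Extract(string input)
    {
        // (?:.|\n)*? matches any run of characters, newlines included,
        // and stops at the first "End of numbers".
        Match m = Regex.Match(input, "Start of numbers((?:.|\n)*?)End of numbers");
        if (!m.Success)
        {
            return new string[0];
        }

        // Split on Windows or Unix line endings and drop blank lines.
        return m.Groups[1].Value.Split(
            new[] { "\r\n", "\n" },
            StringSplitOptions.RemoveEmptyEntries);
    }
}
```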
David Z. answered Oct 23 '22 16:10
