
Regular Expression to match cross platform newline characters

My program accepts data whose line endings may be \n, \r\n, or \r (i.e. Unix, Windows, or classic Mac style).

What is the best way to construct a regular expression that matches a line break regardless of which convention the data uses?

Alternatively, I could use universal_newline support on input, but now I'm interested to see what the regex would be.

Alan asked Aug 26 '09 00:08


People also ask

How do you match a character including newline in regex?

By default in most regex engines, . doesn't match newline characters, so the matching stops at the end of each logical line. If you want . to match really everything, including newlines, you need to enable "dot-matches-all" mode in your regex engine of choice (for example, add the re.DOTALL flag in Python, or the /s modifier in PCRE).

How do I match any character across multiple lines in a regular expression?

So use the character class [\s\S], which matches any character at all, including newlines.
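
As a minimal sketch of the two approaches above, assuming Python's standard re module (the sample string and variable names are made up for illustration), the following compares the default behaviour of . with dot-matches-all mode and with the [\s\S] class:

```python
import re

# Hypothetical sample input containing a single Unix newline.
text = "one\ntwo"

# By default, "." stops at the newline.
default_match = re.match(r".+", text).group()

# With re.DOTALL, "." also matches the newline.
dotall_match = re.match(r".+", text, re.DOTALL).group()

# [\s\S] matches any character without needing a flag.
class_match = re.match(r"[\s\S]+", text).group()

print(repr(default_match))  # 'one'
print(repr(dotall_match))   # 'one\ntwo'
print(repr(class_match))    # 'one\ntwo'
```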

Does \s match new line?

According to regex101.com, \s matches any whitespace character, including spaces, tabs, and newline characters (both \r and \n).

What is regex multiline?

The Multiline option, or the m inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ language elements so that they match the beginning and end of each line, instead of the beginning and end of the input string.
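
A small illustration of multiline mode, assuming Python's re module (the sample text is hypothetical). Note that in Python's re, ^ and $ in multiline mode treat only \n as a line boundary, so input with \r\n or bare \r endings may need to be normalized first:

```python
import re

# Hypothetical sample input with Unix line endings.
text = "first\nsecond\nthird"

# Without the flag, ^ and $ anchor to the whole string, and \w+ cannot
# cross a newline, so nothing matches.
without_flag = re.findall(r"^\w+$", text)

# With re.MULTILINE, ^ and $ match at the start and end of each line.
with_flag = re.findall(r"^\w+$", text, re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['first', 'second', 'third']
```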


1 Answer

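The regex I use when I want to be precise is "\r\n?|\n". It matches exactly one line break per match: \r\n, a bare \r, or a bare \n.

When I'm not concerned about consistency or empty lines, I use "[\r\n]+", which matches a whole run of line-break characters at once, so blank lines are swallowed. I imagine it makes my programs somewhere on the order of 0.2% faster.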
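
To make the difference between the two patterns concrete, here is a short sketch in Python (chosen because the question mentions universal newline support; the names PRECISE and GREEDY and the sample string are invented for this example). It splits the same mixed-ending input with each pattern:

```python
import re

# Matches exactly one line break: \r\n, a lone \r, or a lone \n.
PRECISE = re.compile(r"\r\n?|\n")

# Matches any run of \r and \n characters as a single separator.
GREEDY = re.compile(r"[\r\n]+")

# Hypothetical input mixing Unix, Windows, and classic Mac endings,
# plus one blank line between "d" and "e".
text = "a\nb\r\nc\rd\n\ne"

precise_lines = PRECISE.split(text)
greedy_lines = GREEDY.split(text)

print(precise_lines)  # ['a', 'b', 'c', 'd', '', 'e']
print(greedy_lines)   # ['a', 'b', 'c', 'd', 'e']
```

The precise pattern keeps the empty line as an empty string, while the character-class pattern collapses consecutive line breaks and drops it.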

too much php answered Sep 28 '22 05:09