
How to use JavaScript regex over multiple lines?

var ss = "<pre>aaaa\nbbb\nccc</pre>ddd";
var arr = ss.match( /<pre.*?<\/pre>/gm );
alert(arr);     // null

I want the PRE block to be picked up even though it spans newline characters. I thought the 'm' flag would do that, but it does not.

Found the answer here before posting. Since I thought I knew JavaScript (read three books, worked hours) and there wasn't an existing solution at SO, I'll dare to post anyway. Throw stones here.

So the solution is:

var ss = "<pre>aaaa\nbbb\nccc</pre>ddd";
var arr = ss.match( /<pre[\s\S]*?<\/pre>/gm );
alert(arr);     // <pre>...</pre> :)

Does anyone have a less cryptic way?

Edit: this is a duplicate, but since the original is harder to find than mine, I won't remove this question.

The duplicate proposes [^] as a "multiline dot". What I still don't understand is why [.\n] does not work. I guess this is one of the sad parts of JavaScript.
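On the [.\n] question: inside a character class the dot loses its special meaning and stands for a literal "." character, so [.\n] only matches a dot or a newline. The following is a small sketch of that behavior next to the two workarounds mentioned in this thread; the variable names are made up for illustration.

```javascript
const sample = "<pre>aaaa\nbbb\nccc</pre>ddd";

// Inside [...], "." is a literal dot. [.\n] therefore cannot consume
// ">" or the letters, and the overall match fails.
const withDotClass = sample.match(/<pre[.\n]*?<\/pre>/g);

// [^] is an empty negated class: "any character not in the empty set",
// which in JavaScript means any character, newlines included.
const withEmptyNegation = sample.match(/<pre[^]*?<\/pre>/g);

// [\s\S] is "whitespace or non-whitespace", which is also any character.
const withSpaceClass = sample.match(/<pre[\s\S]*?<\/pre>/g);

// A string made only of dots and newlines is what [.\n] does match.
const dotsAndNewlines = /^[.\n]+$/.test(".\n..");

console.log(withDotClass);          // null
console.log(withEmptyNegation[0]);  // the whole <pre>...</pre> block
console.log(withSpaceClass[0]);     // the same block
console.log(dotsAndNewlines);       // true
```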

akauppi asked Dec 30 '09 12:12


People also ask

What is regex multiline mode?

Multiline option, or the m inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ language elements so that they match the beginning and end of a line, instead of the beginning and end of the input string.

What is multiline flag in regex?

The "m" flag indicates that a multiline input string should be treated as multiple lines. For example, if "m" is used, "^" and "$" change from matching at only the start or end of the entire string to the start or end of any line within the string. You cannot change this property directly.
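A short sketch of what the m flag does and does not change may help here. It affects only the ^ and $ anchors; the dot still stops at a newline, which is why the pattern in the question returned null. The sample string and variable names are illustrative only.

```javascript
const text = "first\nsecond\nthird";

// Without m, ^ matches only at the start of the whole string.
const withoutM = text.match(/^\w+/g);

// With m, ^ also matches right after each line break.
const withM = text.match(/^\w+/gm);

// m does not change ".": it still refuses to match "\n".
const dotWithM = text.match(/first.second/m);

console.log(withoutM);  // one entry: "first"
console.log(withM);     // three entries: "first", "second", "third"
console.log(dotWithM);  // null
```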

Which regex is used to perform multiline matching?

The RegExp m Modifier in JavaScript is used to perform multiline matching.

What is \r and \n in regex?

\n matches a newline character. \r matches a carriage return character.


1 Answer

DON'T use (.|[\r\n]) instead of . for multiline matching.

DO use [\s\S] instead of . for multiline matching

Also, avoid greediness where it is not needed by using the *? or +? quantifiers instead of * or +. This can have a huge performance impact.
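Besides performance, the greedy and lazy quantifiers can return different matches once the input holds more than one block. The sketch below shows only that behavioral difference and makes no timing claim; the two-block input is an assumed example, not taken from the question.

```javascript
const html = "<pre>one\ntwo</pre> text <pre>three</pre>";

// Greedy: [\s\S]* runs to the end of the input and backtracks to the
// LAST </pre>, so both blocks are swallowed into a single match.
const greedy = html.match(/<pre>[\s\S]*<\/pre>/g);

// Lazy: [\s\S]*? stops at the FIRST </pre> it can reach, so each
// block is matched on its own.
const lazy = html.match(/<pre>[\s\S]*?<\/pre>/g);

console.log(greedy.length);  // 1
console.log(lazy.length);    // 2
console.log(lazy[1]);        // <pre>three</pre>
```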

See the benchmark I have made: http://jsperf.com/javascript-multiline-regexp-workarounds

Using [^]: fastest
Using [\s\S]: 0.83% slower
Using (.|\r|\n): 96% slower
Using (.|[\r\n]): 96% slower

NB: You can also use [^], but the comment below advises against it.
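As for the "less cryptic way" asked about in the question: this is not part of the answer above, but engines that implement ES2018 accept an s ("dotAll") flag that makes the dot match line terminators too. A minimal sketch, assuming such an engine is available (older browsers reject the flag with a syntax error):

```javascript
const input = "<pre>aaaa\nbbb\nccc</pre>ddd";

// With the s flag, "." also matches "\n", so the plain dot works.
const dotAllPattern = /<pre.*?<\/pre>/gs;
const dotAllMatch = input.match(dotAllPattern);

console.log(dotAllPattern.dotAll);  // true
console.log(dotAllMatch[0]);        // the whole <pre>...</pre> block
```

Where older engines must be supported, [\s\S] remains the portable choice.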

KrisWebDev answered Oct 08 '22 08:10