
Regular expression in JavaScript string split: browser compatibility issue

I've been investigating an issue that only seems to get worse the deeper I dig.

I started innocently enough trying to use this expression to split a string on HTML 'br' tags:

T = captions.innerHTML.split(/<br.*?>/g);

This works in every browser (FF, Safari, Chrome), except IE7 and IE8 with example input text like this:

is invariably subjective. <br /> 
The less frequently used warnings (Probably/Possibly) <br /> 

Please note that each tag in the example text contains a space before the '/' and is followed by a newline.

Both of the following will match all HTML tags in every browser:

T = captions.innerHTML.split(/<.*?>/g);
T = captions.innerHTML.split(/<.+?>/g);

However, surprisingly (to me at least), this does not work in FF and Chrome:

T = captions.innerHTML.split(/<br.+?>/g);

Edit:

This (suggested several times in the responses below) does not work on IE 7 or 8:

T = captions.innerHTML.split(/<br[^>]*>/g);

(It did work on Chrome and FF.)

My question is: does anyone know an expression that works in all current browsers to match the 'br' tags above (but not other HTML tags)? And can anyone confirm that the last example above should be a valid match, since two characters are present in the example text before the '>'?

PS - my doctype is HTML transitional.

Edit:

I think I have evidence this is specific to the string.split() behavior on IE, and not regex in general. You have to use split() to see this issue. I have also found a test matrix that shows a failure rate of about 30% for split() test cases when I ran it on IE. The same tests passed 100% on FF and Chrome:

http://stevenlevithan.com/demo/split.cfm

So far, I have still not found a solution for IE, and the library provided by the author of that test matrix did not fix this case.

Walt Jones asked May 04 '09 22:05




2 Answers

Your code is not working because IE parses the HTML and converts the tag names to uppercase when you read it back through innerHTML. For example, if you have HTML like this:

<div id='box'>
Hello<br>
World
</div>

And then you use this JavaScript (in IE):

alert(document.getElementById('box').innerHTML);

You will get an alert box with this:

Hello<BR>World

Notice that the <BR> is now uppercase. To fix this, add the i flag alongside the g flag so the regex is case-insensitive, and the split will work as you expect.
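As a rough sketch of that fix, here is the pattern from the question with the i flag added. The two input strings are hard-coded stand-ins for what innerHTML might return (the uppercase form is modeled on the IE behavior described above, not captured from a real browser), and splitOnBr is a hypothetical helper name used only for illustration:

```javascript
// Hypothetical helper: split serialized HTML on <br> tags, ignoring letter case.
// The pattern is the one from the question, with the i flag added.
function splitOnBr(html) {
  return html.split(/<br.*?>/gi);
}

// Stand-ins for innerHTML output; real browsers may serialize differently.
var lowercase = "Hello<br />World";
var uppercase = "Hello<BR>World";

console.log(splitOnBr(lowercase)); // ["Hello", "World"]
console.log(splitOnBr(uppercase)); // ["Hello", "World"]

// Without the i flag, the uppercase tag is not matched and nothing is split.
console.log(uppercase.split(/<br.*?>/g)); // ["Hello<BR>World"]
```

The last line shows the failure mode from the question: the pattern is fine, but it never sees a lowercase "br" in the uppercase string.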

Paolo Bergantino answered Oct 31 '22 00:10


Try this one:

/<br[^>]*>/gi
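One possible reason to prefer the character class over `.*?` is that `.` does not match newline characters, whereas `[^>]` does. The sketch below compares the two on a made-up string; the tag that spans a line break is an assumption added purely to show the difference, and the variable names are hypothetical:

```javascript
// Made-up input: a lowercase tag, an uppercase tag, and a tag containing a newline.
var html = 'first<br />\nsecond<BR>third<br\nclass="x">fourth';

// [^>]* consumes anything up to the closing '>', including newlines.
var withClass = html.split(/<br[^>]*>/gi);

// .*? stops at a newline, so the third tag is never matched.
var withDot = html.split(/<br.*?>/gi);

console.log(withClass); // ["first", "\nsecond", "third", "fourth"]
console.log(withDot);   // ["first", "\nsecond", 'third<br\nclass="x">fourth']
```

Note that both patterns would also match any other tag whose name begins with "br", so a stricter pattern may be needed if such tags can appear in the input.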
Chad Birch answered Oct 31 '22 02:10