I've been investigating an issue that only seems to get worse the deeper I dig.
I started innocently enough trying to use this expression to split a string on HTML 'br' tags:
T = captions.innerHTML.split(/<br.*?>/g);
This works in every browser (FF, Safari, Chrome), except IE7 and IE8 with example input text like this:
is invariably subjective. <br />
The less frequently used warnings (Probably/Possibly) <br />
Please note that each tag in the example text contains a space before the '/' and is followed by a new line.
Both of the following will match all HTML tags in every browser:
T = captions.innerHTML.split(/<.*?>/g);
T = captions.innerHTML.split(/<.+?>/g);
However, surprisingly (to me at least), this does not work in FF and Chrome:
T = captions.innerHTML.split(/<br.+?>/g);
Edit:
This (suggested several times in the responses below) does not work on IE 7 or 8:
T = captions.innerHTML.split(/<br[^>]*>/g);
(It did work on Chrome and FF.)
My question is: does anyone know an expression that works in all current browsers to match the 'br' tags above (but not other HTML tags)? And can anyone confirm that the /<br.+?>/g example above should be a valid match, since two characters are present in the example text before the '>'?
PS - my doctype is HTML transitional.
Edit:
I think I have evidence this is specific to the string.split() behavior on IE, and not regex in general. You have to use split() to see this issue. I have also found a test matrix that shows a failure rate of about 30% for split() test cases when I ran it on IE. The same tests passed 100% on FF and Chrome:
http://stevenlevithan.com/demo/split.cfm
So far, I have still not found a solution for IE, and the library provided by the author of that test matrix did not fix this case.
The split method is not limited to literal string separators. You can also pass a regular expression, so the separator can match a variable run of characters.
To split a string on any of several characters, pass a regular expression as the argument to split(). You can use [] to define a set of characters, rather than a single character, to match.
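As a small sketch of the character-set idea above (the input string and variable names here are made up for illustration):

```javascript
// Hypothetical input: fields separated by a comma, a semicolon, or a pipe.
var line = "alpha,beta;gamma|delta";

// The set [,;|] matches any single one of the three separator characters.
var fields = line.split(/[,;|]/);

console.log(fields); // ["alpha", "beta", "gamma", "delta"]
```

A plain string separator such as "," would only split on the comma and leave "beta;gamma|delta" in one piece.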
Regular expressions in JavaScript are used with the RegExp methods test() and exec() and with the String methods match(), replace(), search(), and split(). Of these, exec() executes a search for a match in a string; it returns an array of information, or null on a mismatch.
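For example, here is roughly what exec() hands back on a match and on a mismatch. The sample strings are assumptions chosen to resemble the markup in the question, not real browser output:

```javascript
// No g flag here, so each exec() call starts searching from the beginning.
var tagPattern = /<br[^>]*>/i;

var hit = tagPattern.exec("Hello<BR>World");
var miss = tagPattern.exec("Hello World");

console.log(hit[0]);    // "<BR>" (the matched text)
console.log(hit.index); // 5 (position of the match in the input)
console.log(miss);      // null (no br tag in the second string)
```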
The ?: at the start of a group indicates that the subpattern is non-capturing. That means whatever is matched by (?:\w+\s), even though it is enclosed in (), won't appear in the list of matches; only what (\w+) matches will.
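A quick way to see the difference, using a made-up input string:

```javascript
// (?:\w+\s) matches a word plus a space but is not captured;
// (\w+) is an ordinary capturing group.
var pattern = /(?:\w+\s)(\w+)/;
var m = pattern.exec("hello world");

console.log(m[0]);     // "hello world" (the whole match)
console.log(m[1]);     // "world" (the only captured group)
console.log(m.length); // 2 (whole match plus one capture)
```

If the first group were written as (\w+\s) instead, the array would gain an extra entry for "hello ".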
Your code is not working because IE parses the HTML and makes the tag names uppercase when you read it back through innerHTML. For example, if you have HTML like this:
<div id='box'>
Hello<br>
World
</div>
And then you use this JavaScript (in IE):
alert(document.getElementById('box').innerHTML);
You will get an alert box with this:
Hello<BR>World
Notice the <BR> is now uppercase. To fix this, just add the i flag in addition to the g flag to make the regex case-insensitive, and it will work as you expect.
Try this one:
/<br[^>]*>/gi
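Here is a rough way to check this without a browser. The string below is a hand-written stand-in for what innerHTML might return (one uppercase tag, one lowercase self-closing tag), not output captured from IE:

```javascript
// Assumed innerHTML-style text mixing an uppercase and a lowercase br tag.
var html = "Hello<BR>World<br />again";

// Case-sensitive: the uppercase <BR> is not treated as a separator.
var caseSensitive = html.split(/<br[^>]*>/g);

// Case-insensitive: both tags are treated as separators.
var fixed = html.split(/<br[^>]*>/gi);

console.log(caseSensitive); // ["Hello<BR>World", "again"]
console.log(fixed);         // ["Hello", "World", "again"]
```

The [^>]* part allows for anything between "br" and the closing ">", so the same pattern covers a bare tag, a tag with a space and slash, and a tag with attributes.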