Different split() regex result in IE

Tags:

javascript

regex

I get some HTML as an Ajax response, and I need to get just the body contents. So I made this regex:

/(<body>|<\/body>)/ig

It works well in all browsers, but for some reason IE gives me a different array when I use split:

data.split(/(<body>|<\/body>)/ig)

In all other browsers the content of the body is in split(/(<body>|<\/body>)/ig)[2], but in IE it's in split(/(<body>|<\/body>)/ig)[1] (tested in IE7 & 8).

Why is this? And how could I modify it in order to get the same array in all browsers?

Edit: just to clarify, I already have a solution, as mentioned by tobyodavies. I want to understand why it behaves differently.

This is the HTML from the response (the string in data):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"  xml:lang="de"  lang="de" dir="ltr">
<head>
blablabla...
</head>
<body>
<div class="iframe">
   <div id="block-menu-menu-primary-links-user" class="block-menu">
 <h3>Primary Links - User</h3>  <div class="content"><ul class="menu"><li class="leaf first"><a target="content" href="#someurl" title="">Login</a></li>
<li class="leaf last"><a target="content" href="#someurl" title="">Register</a></li>
</ul></div>
</div>
</div>
</body>
</html>

PS: I know that parsing HTML with regex is bad, but it's not my code; I just need to fix it.

306

asked Apr 04 '11 09:04

meo

1 Answer

It behaves differently because of the capturing group created by the parentheses. Other browsers add the text matched by each capturing group to the resulting array; IE 8 and lower do not. To get a consistent result, you'd have to make the group non-capturing:

/(?:<body>|<\/body>)/ig

This is why other browsers have the content in [2] rather than [1]: in a compliant implementation, [1] contains the string "<body>". The other browsers have it right on this one, and Internet Explorer 9 fixed the problem by implementing the method as outlined by the ECMAScript 5th Edition specification.
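
As a rough illustration of the index shift, here is a minimal sketch, assuming an ES5-compliant engine. The sample string is a hypothetical, shortened stand-in for the response in the question, and the variable names are made up for the example:

```javascript
// Hypothetical sample, shortened from the HTML in the question.
const data = "<html><head>x</head><body><div>Login</div></body></html>";

// Capturing group: a compliant split() splices each captured delimiter
// into the result, so the body content lands at index 2.
const withCapture = data.split(/(<body>|<\/body>)/ig);

// Non-capturing group: only the text between delimiters is returned,
// so the body content lands at index 1.
const withoutCapture = data.split(/(?:<body>|<\/body>)/ig);

const bodyFromCapture = withCapture[2];
const bodyFromNonCapture = withoutCapture[1];
```

With the non-capturing form, the array no longer depends on whether the engine includes captures in the result, which is the difference described above.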

There are more inconsistencies than this, though. ECMAScript 5 compliance in all browsers will resolve these differences, but you might want to take a look at Steven Levithan's blog, where he outlines the differing implementations and even provides a custom split() method as a solution to the problem.
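
If the goal is only to pull out the body contents, one way to sidestep split() altogether is a single exec() call with a capture group. This is not the custom split() mentioned above, just an alternative sketch; the function name extractBody is hypothetical, and the pattern assumes at most one body element in the string:

```javascript
// Returns the text between <body ...> and </body>, or null if no body is found.
// [\s\S] is used instead of "." so that the match can span line breaks.
function extractBody(html) {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  return match ? match[1] : null;
}

// Hypothetical usage with a shortened response string.
const sample = "<html><head>x</head><body>\n<div>Login</div>\n</body></html>";
const bodyContents = extractBody(sample);
```

Unlike the patterns in the question, this one also tolerates attributes on the opening body tag.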

172

answered Oct 05 '22 20:10

Andy E
