
Regex using js to strip js from html

I'm using jQuery to sort a column of emails, but they are Base64-encoded in JS, so I need a regex that ignores the <script>.*?</script> tags and sorts only on what comes after them (the text within the <noscript> tags).

Column HTML

<td>
  <script type="text/javascript">
      document.write(Base64.decode('PG5vYnI+PGEgaHJlZj0ibWFpbHRvOmJpY2VAdWNzYy5lZHUiIHRpdGxlPSJiaWNlQHVjc2MuZWR1Ij5iaWNlPC9hPjwvbm9icj48YnIgLz4K'));
  </script>
  <noscript>username</noscript>
</td>

Regex that needs some love

a.replace(/<script.*?<\/script>(.*?)/i,"$1");
asked Nov 04 '22 04:11 by Jeffrey


1 Answer

Assuming that the structure of the HTML doesn't change, you can use this:

$(a).contents().filter(function () {
    return this.nodeType === 3; // 3 is Node.TEXT_NODE
}).eq(1).text();

It gets the child nodes, filters them down to text nodes, takes the one at index 1, and gets its text value.

And if you want to stick with regexp, here's one:

a.replace(/(<script type="text\/javascript">[^>]+>|<noscript>.*<\/noscript>)/ig,"");
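
As far as I can tell, on the sample cell from the question this pattern also deletes the <noscript> element together with its text, so nothing is left to sort on. A minimal sketch that instead drops the <script> block and keeps the <noscript> text might look like the following. The names noscriptText and makeCell and the BASE64_PAYLOAD placeholder are made up for illustration and are not from the posts above, and it assumes each cell is available as an HTML string. It uses [\s\S] rather than . because . does not match newlines by default, which is also one reason the regex in the question cannot span the multi-line script block.

```javascript
// Hypothetical helper (not from the original posts): returns the text inside
// the first <noscript> element of an HTML string, ignoring <script> blocks.
function noscriptText(html) {
  // Drop every <script>...</script> block; [\s\S] matches across newlines.
  const withoutScripts = html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "");
  // Capture whatever sits between <noscript> and </noscript>.
  const match = /<noscript\b[^>]*>([\s\S]*?)<\/noscript>/i.exec(withoutScripts);
  return match ? match[1].trim() : "";
}

// Hypothetical builder for a cell shaped like the one in the question.
// BASE64_PAYLOAD is a stand-in, not the real encoded string.
function makeCell(username) {
  return [
    "<td>",
    '  <script type="text/javascript">',
    "      document.write(Base64.decode('BASE64_PAYLOAD'));",
    "  </script>",
    "  <noscript>" + username + "</noscript>",
    "</td>",
  ].join("\n");
}

const cell = makeCell("username");
console.log(noscriptText(cell)); // prints "username"

// Sorting a column: compare cells by their extracted <noscript> text.
const cells = [makeCell("zed"), makeCell("amy"), makeCell("bob")];
const sortedKeys = cells
  .slice() // copy so the original array is left untouched
  .sort((x, y) => noscriptText(x).localeCompare(noscriptText(y)))
  .map(noscriptText);
console.log(sortedKeys.join(", ")); // prints "amy, bob, zed"
```

If the cells are live DOM elements rather than strings, the jQuery approach above is probably the safer route, since regexes over HTML break easily when the markup changes.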
answered Nov 07 '22 22:11 by Kevin B