Using Regex to remove script tags

Tags: c#, regex

I'm trying to use a regular expression I found on this website, and it doesn't seem to work. Any ideas?

Input string:

sFetch = "123<script type=\"text/javascript\">\n\t\tfunction utmx_section(){}function utmx(){}\n\t\t(function()})();\n\t</script>456";

Regex:

sFetch = Regex.Replace(sFetch, "<script.*?>.*?</script>", "", RegexOptions.IgnoreCase);

asked Mar 24 '10 07:03

amitre

2 Answers

Add RegexOptions.Singleline, so that . also matches newline characters:

RegexOptions.IgnoreCase | RegexOptions.Singleline
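
A minimal sketch of that change, using the pattern from the question. The class and method names (ScriptStripper, StripWithoutSingleline, StripWithSingleline) are made up for illustration; only Regex.Replace and the RegexOptions flags come from .NET.

```csharp
using System.Text.RegularExpressions;

public static class ScriptStripper
{
    // The pattern from the question, unchanged.
    private const string Pattern = "<script.*?>.*?</script>";

    // Original call: '.' stops at '\n', so a multi-line script block is not matched.
    public static string StripWithoutSingleline(string html)
    {
        return Regex.Replace(html, Pattern, "", RegexOptions.IgnoreCase);
    }

    // With Singleline, '.' also matches '\n', so the whole block is removed.
    public static string StripWithSingleline(string html)
    {
        return Regex.Replace(html, Pattern, "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

With the sFetch string from the question, StripWithoutSingleline returns the input unchanged, while StripWithSingleline returns "123456".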

Even with that option, the regex will never match input like the following:

<script
>
alert(1)
</script
/**/
>

So use an HTML parser such as HTML Agility Pack instead.
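
A rough sketch of the parser-based approach, assuming the HtmlAgilityPack NuGet package is installed. The ScriptRemover class and RemoveScripts method are hypothetical names. How any parser treats malformed markup such as the example above is worth checking against your own inputs.

```csharp
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

public static class ScriptRemover
{
    public static string RemoveScripts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Copy to a list first so nodes are not removed while enumerating.
        var scripts = doc.DocumentNode.Descendants("script").ToList();
        foreach (var script in scripts)
        {
            script.Remove();
        }

        // Serialize what is left of the document back to a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling ScriptRemover.RemoveScripts(sFetch) returns the markup with the script elements taken out of the parsed tree, without relying on a pattern that guesses where tags begin and end.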

answered Oct 04 '22 13:10

YOU

The regex fails because your input contains newlines, and by default the metacharacter . does not match a newline.

To solve this, you can use the RegexOptions.Singleline option as S.Mark says, or you can change the regex to:

"<script[\d\D]*?>[\d\D]*?</script>"

which uses [\d\D] instead of the dot.

\d is any digit and \D is any non-digit, so [\d\D] matches a digit or a non-digit, which is effectively any character, including a newline.
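
The difference between . and [\d\D] can be checked directly. This is a small sketch; AnyCharDemo and its method names are made up for illustration, and only Regex.IsMatch and Regex.Replace are .NET calls.

```csharp
using System.Text.RegularExpressions;

public static class AnyCharDemo
{
    // '.' without Singleline does not match a newline.
    public static bool DotMatchesNewline()
    {
        return Regex.IsMatch("\n", ".");
    }

    // [\d\D] is "digit or non-digit", so it matches a newline too.
    public static bool DigitOrNonDigitMatchesNewline()
    {
        return Regex.IsMatch("\n", @"[\d\D]");
    }

    // The rewritten pattern needs no Singleline option to span lines.
    public static string Strip(string html)
    {
        return Regex.Replace(html, @"<script[\d\D]*?>[\d\D]*?</script>", "",
            RegexOptions.IgnoreCase);
    }
}
```

DotMatchesNewline returns false and DigitOrNonDigitMatchesNewline returns true; Strip applied to the sFetch string from the question returns "123456".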

answered Oct 04 '22 13:10

codaddict
