How to determine if a string has been encoded programmatically in C#?

Tags:

c#

asp.net

Take, for example, this string:

&lt;p&gt;test&lt;/p&gt;

I would like my logic to recognize that this value has been encoded. Any ideas? Thanks

GibboK asked Dec 22 '10 09:12


4 Answers

You can use HttpUtility.HtmlDecode() to decode the string, then compare the result with the original string. If they're different, the original string was probably encoded (at least, the routine found something to decode inside):

public bool IsHtmlEncoded(string text)
{
    return (HttpUtility.HtmlDecode(text) != text);
}
Frédéric Hamidi answered Nov 16 '22 07:11


Strictly speaking, that's not possible. What the string contains might actually be the intended text, and the encoded version of that would be &amp;lt;p&amp;gt;test&amp;lt;/p&amp;gt;.

You could look for HTML entities in the string and decode it until there are none left, but it's risky to decode data that way, because it assumes things that might not be true.
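
A minimal sketch of that decode-until-stable idea, and of why it is risky. The class name DecodeDemo, the helper DecodeFully, and the maxPasses limit are hypothetical names introduced only for illustration. It uses System.Net.WebUtility.HtmlDecode rather than HttpUtility so that it runs as a plain console program without a System.Web reference.

```csharp
using System;
using System.Net;

public static class DecodeDemo
{
    // Hypothetical helper: decode repeatedly until a pass changes nothing.
    // maxPasses is an assumed safety limit, not something from the answer.
    public static string DecodeFully(string text, int maxPasses = 10)
    {
        for (int i = 0; i < maxPasses; i++)
        {
            string decoded = WebUtility.HtmlDecode(text);
            if (decoded == text)
            {
                break; // nothing left to decode
            }
            text = decoded;
        }
        return text;
    }

    public static void Main()
    {
        // Double-encoded markup is unwrapped completely.
        Console.WriteLine(DecodeFully("&amp;lt;p&amp;gt;test&amp;lt;/p&amp;gt;"));

        // The risk: here "&lt;" is the literal text the author meant to show,
        // yet the loop still rewrites it to "<".
        Console.WriteLine(DecodeFully("Write &lt; to display a less-than sign"));
    }
}
```

The second call shows the problem: the loop cannot tell an entity that was produced by encoding from one that the author typed on purpose.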

Guffa answered Nov 16 '22 06:11


This is my take on it. If the user passes in partially encoded text, this will catch it.

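private bool EncodeText(string val)
{
    // Decode, then re-encode. Input that was fully encoded survives the
    // round trip unchanged; partially encoded input does not.
    string decodedText = HttpUtility.HtmlDecode(val);
    string encodedText = HttpUtility.HtmlEncode(decodedText);

    return encodedText.Equals(val, StringComparison.OrdinalIgnoreCase);
}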
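
A rough side-by-side of this round-trip check and the simpler decode-and-compare check, to show where they disagree. The names EncodingChecks, DecodeChangesText and SurvivesRoundTrip are hypothetical, and System.Net.WebUtility stands in for HttpUtility so the sketch runs as a console program; the two APIs may differ on some characters, so treat the output as indicative only.

```csharp
using System;
using System.Net;

public static class EncodingChecks
{
    // Decode-and-compare: true when decoding changes the text.
    public static bool DecodeChangesText(string text)
    {
        return WebUtility.HtmlDecode(text) != text;
    }

    // Round trip: true when decoding and re-encoding reproduces the input.
    public static bool SurvivesRoundTrip(string text)
    {
        string decoded = WebUtility.HtmlDecode(text);
        string reencoded = WebUtility.HtmlEncode(decoded);
        return reencoded.Equals(text, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string[] samples = { "&lt;p&gt;test&lt;/p&gt;", "&lt;b>", "test" };
        foreach (string s in samples)
        {
            Console.WriteLine(s
                + "  decode changes text: " + DecodeChangesText(s)
                + "  survives round trip: " + SurvivesRoundTrip(s));
        }
    }
}
```

The partially encoded "&lt;b>" fails the round trip, which is the case this answer is aiming at. Plain text such as "test" passes the round trip even though nothing in it was encoded, so requiring both checks to be true is one way to mean "contains entities and is consistently encoded".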
elvis answered Nov 16 '22 07:11


I use the NeedsEncoding() method below to determine whether a string needs encoding.

Results 
-----------------------------------------------------
b               -->      NeedsEncoding = True
&lt;b>          -->      NeedsEncoding = True
<b>             -->      NeedsEncoding = True
&lt;b&lt;       -->      NeedsEncoding = False
&quot;          -->      NeedsEncoding = False

Here are the helper methods. I split the logic into two methods for clarity. As Guffa says, it is risky, and it is hard to produce a bulletproof method.

    public static bool IsEncoded(string text)
    {
        // The checks below fix false positives such as "&lt;<>", where only
        // part of the text is encoded. You could add a complete blacklist,
        // but these are the ones that cause HTML injection issues.
        if (text.Contains("<")) return false;
        if (text.Contains(">")) return false;
        if (text.Contains("\"")) return false;
        if (text.Contains("'")) return false;
        if (text.Contains("script")) return false;

        // if decoding changes the string, it contained encoded entities
        return (System.Web.HttpUtility.HtmlDecode(text) != text);
    }

    public static bool NeedsEncoding(string text)
    {
        return !IsEncoded(text);
    }
Ian G answered Nov 16 '22 06:11