
Safe/Allowed filename cleaner for .NET

Is there any standardized / libraried / tested way in .NET to take an arbitrary string and mangle it in such a way that it represents a valid file name?

Rolling my own char-replace function is easy enough, but I'd like something a little more robust and reused.

Craig Walker asked Dec 07 '09 21:12


1 Answer

You can use Path.GetInvalidFileNameChars to find out which characters of the string are invalid, and either convert them to a valid character such as a hyphen or, if you need bidirectional conversion, substitute them with an escape token such as %, followed by the hexadecimal representation of their Unicode code (I have actually used this technique once but don't have the code at hand right now).
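For the one-way case, the hyphen substitution could look roughly like the sketch below. The class and method names (FileNameSanitizer, Sanitize) and the replacement parameter are made up for illustration; only Path.GetInvalidFileNameChars comes from the framework. Keep in mind that the set it returns depends on the platform, and that this conversion is lossy: "a/b" and "a-b" end up as the same name.

```csharp
using System;
using System.IO;
using System.Text;

static class FileNameSanitizer
{
    // Hypothetical helper, not part of the framework: replaces every character
    // reported by Path.GetInvalidFileNameChars with a replacement character
    // (a hyphen by default). The replacement itself is assumed to be valid.
    public static string Sanitize(string name, char replacement = '-')
    {
        char[] invalidChars = Path.GetInvalidFileNameChars();
        var result = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            // Array.IndexOf returns -1 when c is not in the invalid set.
            result.Append(Array.IndexOf(invalidChars, c) >= 0 ? replacement : c);
        }
        return result.ToString();
    }
}
```

For example, FileNameSanitizer.Sanitize("report/2009") gives "report-2009", since "/" is reported as invalid on both Windows and Unix-like systems.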

EDIT: Just in case someone is interested, here is the code.

using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

/// <summary>
/// Escapes an object name so that it is a valid filename.
/// </summary>
/// <param name="fileName">Original object name.</param>
/// <returns>Escaped name.</returns>
/// <remarks>
/// All characters that are not valid for a filename, plus "%" and ".", are converted into "%uuuu", where uuuu is the hexadecimal
/// unicode representation of the character.
/// </remarks>
private string EscapeFilename(string fileName)
{
    char[] invalidChars = Path.GetInvalidFileNameChars();

    // Escape "%" first so that the "%" introduced by the later escapes is not
    // escaped again, then escape every invalid character, and finally escape ".".
    fileName = fileName.Replace("%", "%0025");
    foreach (char invalidChar in invalidChars)
    {
        // "X4" formats the character code as four zero-padded uppercase hex digits.
        fileName = fileName.Replace(invalidChar.ToString(), "%" + ((int)invalidChar).ToString("X4"));
    }
    return fileName.Replace(".", "%002E");
}

/// <summary>
/// Unescapes an escaped file name so that the original object name is obtained.
/// </summary>
/// <param name="escapedName">Escaped object name (see the EscapeFilename method).</param>
/// <returns>Unescaped (original) object name.</returns>
public string UnescapeFilename(string escapedName)
{
    // Every "%" in an escaped name starts a "%uuuu" sequence (a literal "%" is
    // stored as "%0025"), so each sequence can be decoded on its own in a single
    // left-to-right pass. Regex.Replace never rescans the text it has already
    // produced, so a name that originally contained something looking like an
    // escape is restored correctly (for example, ".%002E" once escaped is
    // "%002E%0025002E", and it is unescaped back to ".%002E" rather than "..").
    return Regex.Replace(
        escapedName,
        "%([0-9A-Fa-f]{4})",
        m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
}

Konamiman answered Oct 21 '22 06:10