
PathCanonicalize equivalent in C#

Tags: c#, filenames

What is the equivalent to PathCanonicalize in C#?

Use: I need to take a good guess at whether two file paths refer to the same file (without disk access). My typical approach has been to run both paths through a few filters like MakeAbsolute and PathCanonicalize, and then do a case-insensitive comparison.

peterchen asked Mar 08 '09 09:03





2 Answers

Path.GetFullPath() does not preserve relative paths: it resolves them against the current directory and returns an absolute path. I've been looking for a solution that canonicalizes relative paths and keeps them relative.

I tried many methods but none of them worked. @Paul's third strategy does not work on Linux because of its \\ root, and it has a bug with relative paths: it introduces one extra folder, so you lose one .. in the result.

Here's a solution that works with both relative and absolute paths, on both Linux and Windows. It keeps any leading .. segments at the beginning of the path as expected (anywhere else they are normalized). It still relies on Path.GetFullPath, with a small workaround.

It's an extension method, so use it like text.Canonicalize().

using System;
using System.IO;

// Extension methods must live in a static class; the name PathExtensions is a placeholder.
public static class PathExtensions
{
    /// <summary>
    ///     Fixes "../.." etc
    /// </summary>
    public static string Canonicalize(this string path)
    {
        if (path.IsAbsolutePath())
            return Path.GetFullPath(path);
        var fakeRoot = Environment.CurrentDirectory; // Gives us a cross platform full path
        var combined = Path.Combine(fakeRoot, path);
        combined = Path.GetFullPath(combined);
        return combined.RelativeTo(fakeRoot);
    }

    // A path whose root is only a separator (such as "\foo") is not treated as absolute by this check.
    private static bool IsAbsolutePath(this string path)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));
        return
            Path.IsPathRooted(path)
            && !Path.GetPathRoot(path).Equals(Path.DirectorySeparatorChar.ToString(), StringComparison.Ordinal)
            && !Path.GetPathRoot(path).Equals(Path.AltDirectorySeparatorChar.ToString(), StringComparison.Ordinal);
    }

    // Expresses filespec relative to folder; both arguments must be absolute paths.
    private static string RelativeTo(this string filespec, string folder)
    {
        var pathUri = new Uri(filespec);
        // Folders must end in a slash
        if (!folder.EndsWith(Path.DirectorySeparatorChar.ToString(), StringComparison.Ordinal)) folder += Path.DirectorySeparatorChar;
        var folderUri = new Uri(folder);
        return Uri.UnescapeDataString(folderUri.MakeRelativeUri(pathUri).ToString()
            .Replace('/', Path.DirectorySeparatorChar));
    }
}
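
On newer runtimes the same idea can probably be written without the Uri round trip. The following is only a sketch, assuming .NET Core 2.1 or later (where Path.IsPathFullyQualified and Path.GetRelativePath are available); the class name RelativePathCanonicalizer is made up for illustration and is not part of the answer above.

```csharp
using System;
using System.IO;

// Hypothetical class name, used only for this sketch.
public static class RelativePathCanonicalizer
{
    // Collapses "." and ".." segments; a relative input stays relative.
    public static string Canonicalize(string path)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));

        // Fully qualified paths ("C:\x" on Windows, "/x" on Unix) need no base directory.
        if (Path.IsPathFullyQualified(path))
            return Path.GetFullPath(path);

        // Resolve against the current directory, then express the result
        // relative to that same directory again.
        string baseDir = Environment.CurrentDirectory;
        string full = Path.GetFullPath(Path.Combine(baseDir, path));
        return Path.GetRelativePath(baseDir, full);
    }
}
```

For example, RelativePathCanonicalizer.Canonicalize("a/../b") gives "b". Leading .. segments survive only as long as the current directory is deep enough to absorb them; if the process runs from the file system root, they collapse.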
U. Bulle answered Oct 06 '22 00:10



3 solutions:

Best case scenario: you are 100% certain the calling process will have full access to the file system. CAVEAT: permissions on a production box can be tricky.

    public static string PathCombineAndCanonicalize1(string path1, string path2)
    {
        string combined = Path.Combine(path1, path2);
        combined = Path.GetFullPath(combined);
        return combined;
    }

But we don't always have that freedom. Often you need to do the string arithmetic without file system permissions. There is a native call for this. CAVEAT: resorts to a native call, so it is Windows only.

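    // Requires: using System.Runtime.InteropServices; using System.Text;
    public static string PathCombineAndCanonicalize2(string path1, string path2)
    {
        string combined = Path.Combine(path1, path2);
        // Per the Win32 documentation, the destination buffer must hold at least MAX_PATH (260) characters.
        StringBuilder sb = new StringBuilder(Math.Max(260, 2 * combined.Length));
        // PathCanonicalize returns false on failure; surface that instead of returning a partial result.
        if (!PathCanonicalize(sb, combined))
            throw new ArgumentException("PathCanonicalize failed for: " + combined);
        return sb.ToString();
    }

    // Native shell helper from shlwapi.dll (Windows only).
    [DllImport("shlwapi.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern bool PathCanonicalize([Out] StringBuilder dst, string src);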

A third strategy is to trick the CLR. Path.GetFullPath() works just fine on a fictitious path, so just make sure you're always giving it one. What you can do is to swap out the root with a phony UNC path, call GetFullPath(), and then swap the real one back in. CAVEAT: this may require a hard sell in code review

    public static string PathCombineAndCanonicalize3(string path1, string path2)
    {
        string originalRoot = string.Empty;

        // Strip the real root (e.g. "C:\") so it can be swapped back in at the end.
        if (Path.IsPathRooted(path1))
        {
            originalRoot = Path.GetPathRoot(path1);
            path1 = path1.Substring(originalRoot.Length);
        }

        // Phony UNC root; GetFullPath only needs it as an anchor for the string manipulation.
        string fakeRoot = @"\\thiscantbe\real\";
        string combined = Path.Combine(fakeRoot, path1, path2);
        combined = Path.GetFullPath(combined);
        // Remove the phony root again and restore the original one.
        combined = combined.Substring(fakeRoot.Length);
        combined = Path.Combine(originalRoot, combined);
        return combined;
    }
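
To tie this back to the use case in the question (guessing whether two paths name the same file), here is a minimal sketch built on the first strategy. The PathComparer class and its ProbablySameFile method are hypothetical names, not an existing API. It only manipulates strings, so it will not see through symbolic links or short (8.3) names, and the case-insensitive comparison can report false matches on case-sensitive file systems.

```csharp
using System;
using System.IO;

// Hypothetical helper, used only for this sketch.
public static class PathComparer
{
    // Best-guess equality of two paths using string manipulation only (no disk access).
    public static bool ProbablySameFile(string pathA, string pathB)
    {
        return string.Equals(Normalize(pathA), Normalize(pathB),
            StringComparison.OrdinalIgnoreCase);
    }

    private static string Normalize(string path)
    {
        // GetFullPath makes the path absolute and collapses "." and ".." segments.
        string full = Path.GetFullPath(path);
        // Drop trailing separators so "dir" and "dir/" compare equal.
        return full.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
    }
}
```

With this, PathComparer.ProbablySameFile("a/b/../c.txt", "A/c.txt") returns true, while "a/c.txt" and "a/d.txt" compare as different.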
Paul Williams answered Oct 05 '22 23:10
