
Parsing URI in .NET

Tags:

c#

uri

Short Version

Is there a class in .NET that can parse URIs?

Background

The Windows Search service registers content to crawl through the use of URIs. Using ISearchCrawlScopeManager you can enumerate the various root URIs:

  • csc://{S-1-5-21-397955417-62688126-188441444-1010}/
  • defaultroot://{S-1-5-21-397955417-62688126-188441444-1010}/
  • file:///C:\
  • file:///D:\
  • iehistory://{S-1-5-21-397955417-62688126-188441444-1010}/
  • mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/
  • winrt://{S-1-5-21-397955417-62688126-188441444-1010}/

Unfortunately, the .NET Uri class is unable to parse these URIs (dotNetFiddle):

Run-time exception (line 8): Invalid URI: The hostname could not be parsed.

Stack Trace:

[System.UriFormatException: Invalid URI: The hostname could not be parsed.]
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)

Is there a class in .NET that can parse URIs?
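
To check which of the crawl-scope roots System.Uri accepts without catching exceptions, a small probe can call Uri.TryCreate. This is only a minimal sketch: the class name UriProbe and the helper CanParse are made up for illustration, and since the result may differ between .NET versions no output is shown.

```csharp
using System;

public static class UriProbe
{
    // Hypothetical helper: true when System.Uri accepts the text as an absolute URI.
    public static bool CanParse(string text)
    {
        Uri parsed;
        return Uri.TryCreate(text, UriKind.Absolute, out parsed);
    }

    public static void Main()
    {
        // Root URIs copied from the list in the question.
        string[] roots =
        {
            "csc://{S-1-5-21-397955417-62688126-188441444-1010}/",
            @"file:///C:\",
            "mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/",
        };

        foreach (string root in roots)
        {
            // Prints True or False followed by the URI that was tested.
            Console.WriteLine("{0,-5} {1}", CanParse(root), root);
        }
    }
}
```

TryCreate reports the failure as a return value, which is convenient when enumerating many roots and only some of them are expected to parse.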

The native Win32 function InternetCrackUrl is able to correctly handle the Uri:

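#include <windows.h>
#include <wininet.h>   // link with wininet.lib

// url is assumed to be a null-terminated LPCTSTR holding the URI to split.
URL_COMPONENTS components = {};   // zero every pointer member first
components.dwStructSize      = sizeof(URL_COMPONENTS);
// A non-zero length with a null pointer asks for a pointer into url rather than a copy.
components.dwSchemeLength    = DWORD(-1);
components.dwHostNameLength  = DWORD(-1);
components.dwUserNameLength  = DWORD(-1);
components.dwPasswordLength  = DWORD(-1);
components.dwUrlPathLength   = DWORD(-1);
components.dwExtraInfoLength = DWORD(-1);

// Passing 0 as the length tells the function that url is null-terminated.
InternetCrackUrl(url, 0, 0, &components);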

mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/
\__/   \__________________________________________/\_________________/
 |                           |                              |
Scheme                    HostName                       UrlPath

Scheme:   "mapi"
HostName: "{S-1-5-21-397955417-62688126-188441444-1010}"
UrlPath:  "/Outlook2003/Inbox/"
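
The same three-way split can be approximated in managed code without System.Uri by applying the generic component regex from RFC 3986, Appendix B. The sketch below is not what InternetCrackUrl does internally; the class LooseUri and its Split method are hypothetical names, and the regex returns the whole authority (which would also include any user info or port), not a validated host name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LooseUri
{
    // Generic URI component regex from RFC 3986, Appendix B.
    // Group 2 = scheme, group 4 = authority, group 5 = path.
    private static readonly Regex Pattern = new Regex(
        @"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?",
        RegexOptions.CultureInvariant);

    // Splits a URI string without validating any of its components.
    public static void Split(string uri, out string scheme, out string authority, out string path)
    {
        Match m = Pattern.Match(uri);
        scheme = m.Groups[2].Value;
        authority = m.Groups[4].Value;
        path = m.Groups[5].Value;
    }

    public static void Main()
    {
        string scheme, authority, path;
        Split("mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/",
              out scheme, out authority, out path);

        Console.WriteLine("Scheme:    " + scheme);
        Console.WriteLine("Authority: " + authority);
        Console.WriteLine("Path:      " + path);
    }
}
```

Because the regex performs no validation, it accepts the braces around the SID that System.Uri rejects; the trade-off is that malformed input is split rather than reported.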

Bonus Chatter

Applying URI escaping to the URI:

  • Before: mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/
  • After: mapi://%7BS-1-5-21-397955417-62688126-188441444-1010%7D/Outlook2003/Inbox/

doesn't help (dotNetFiddle).

Difference between Uri and Url?

URLs are a subset of URIs:

  • A URI tells you what a thing is
  • A URL tells you where to get a thing

E.g.:

  • URI: isbn:1631971727 (identifies a thing)
  • URL: isbn://amazon.com/1631971727 (where to get a thing)

Url

The breakdown of a URL is:

  foo://iboyd:Trubador@example.com:8042/look/over/there?name=ferret#nose
  \_/   \___/ \______/ \_________/ \__/\______________/\__________/ \__/
   |      |      |         |        |         |            |         |
scheme username password  host     port     path         query    fragment
  • Scheme: foo
  • Username: iboyd
  • Password: Trubador
  • Host: example.com
  • Port: 8042
  • Path: /look/over/there
  • Query: ?name=ferret
  • Fragment: nose
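
For a well-formed URL like this one, System.Uri exposes the same breakdown through its properties. The sketch below uses the example URL with the user name and password from the list above; the class UrlParts and its Parse method are hypothetical names used only for illustration.

```csharp
using System;

public static class UrlParts
{
    // Thin wrapper so the parsed Uri can be inspected by callers.
    public static Uri Parse(string url)
    {
        return new Uri(url, UriKind.Absolute);
    }

    public static void Main()
    {
        Uri uri = Parse("foo://iboyd:Trubador@example.com:8042/look/over/there?name=ferret#nose");

        Console.WriteLine("Scheme:   " + uri.Scheme);
        Console.WriteLine("UserInfo: " + uri.UserInfo);     // user name and password together
        Console.WriteLine("Host:     " + uri.Host);
        Console.WriteLine("Port:     " + uri.Port);
        Console.WriteLine("Path:     " + uri.AbsolutePath);
        Console.WriteLine("Query:    " + uri.Query);        // includes the leading '?'
        Console.WriteLine("Fragment: " + uri.Fragment);     // includes the leading '#'
    }
}
```

Note that System.Uri does not separate the user name from the password; both arrive in UserInfo as "iboyd:Trubador", and Fragment is reported with its leading '#'.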
Ian Boyd asked Nov 08 '22 05:11

1 Answer

The ResolveHelper() method, which is called by CreateThis() (as you can see in the stack trace), identifies the string as an absolute URI, and hence an exception is thrown.

Change your URI from:

mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/

to:

mapi:////{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/
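
The rewrite suggested above can be applied mechanically before handing the string to System.Uri. The sketch below is untested against the original fiddle: AuthorityWorkaround and AddEmptyAuthority are hypothetical names, and whether the rewritten string parses, and what Host and AbsolutePath then contain, is left for the program to report and is not asserted here.

```csharp
using System;

public static class AuthorityWorkaround
{
    // Hypothetical helper: turns "scheme://rest" into "scheme:////rest",
    // i.e. inserts two extra slashes right after the "://" marker.
    public static string AddEmptyAuthority(string uri)
    {
        int marker = uri.IndexOf("://", StringComparison.Ordinal);
        if (marker < 0)
        {
            return uri; // nothing to rewrite
        }
        return uri.Insert(marker + 3, "//");
    }

    public static void Main()
    {
        string original = "mapi://{S-1-5-21-397955417-62688126-188441444-1010}/Outlook2003/Inbox/";
        string rewritten = AddEmptyAuthority(original);
        Console.WriteLine(rewritten);

        Uri parsed;
        if (Uri.TryCreate(rewritten, UriKind.Absolute, out parsed))
        {
            // With the extra slashes, the SID presumably moves out of the host and into the path.
            Console.WriteLine("Host: '" + parsed.Host + "'");
            Console.WriteLine("Path: '" + parsed.AbsolutePath + "'");
        }
        else
        {
            Console.WriteLine("Still not accepted by System.Uri.");
        }
    }
}
```

If the rewritten form does parse, callers that need the SID would have to recover it from the path, so this is a workaround rather than a true equivalent of the InternetCrackUrl result.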

.NET source code of the ResolveHelper() method, from Reference Source (.NET Framework 4.7.2):

internal static Uri ResolveHelper(Uri baseUri, Uri relativeUri, ref string newUriString, ref bool userEscaped, 
            out UriFormatException e)
        {
            Debug.Assert(!baseUri.IsNotAbsoluteUri && !baseUri.UserDrivenParsing, "Uri::ResolveHelper()|baseUri is not Absolute or is controlled by User Parser.");

            e = null;
            string relativeStr = string.Empty;

            if ((object)relativeUri != null)
            {
                if (relativeUri.IsAbsoluteUri)
                    return relativeUri;

                relativeStr = relativeUri.OriginalString;
                userEscaped = relativeUri.UserEscaped;
            }
            else
                relativeStr = string.Empty;

            // Here we can assert that passed "relativeUri" is indeed a relative one

            if (relativeStr.Length > 0 && (IsLWS(relativeStr[0]) || IsLWS(relativeStr[relativeStr.Length - 1])))
                relativeStr = relativeStr.Trim(_WSchars);

            if (relativeStr.Length == 0)
            {
                newUriString = baseUri.GetParts(UriComponents.AbsoluteUri, 
                    baseUri.UserEscaped ? UriFormat.UriEscaped : UriFormat.SafeUnescaped);
                return null;
            }

            // Check for a simple fragment in relative part
            if (relativeStr[0] == '#' && !baseUri.IsImplicitFile && baseUri.Syntax.InFact(UriSyntaxFlags.MayHaveFragment))
            {
                newUriString = baseUri.GetParts(UriComponents.AbsoluteUri & ~UriComponents.Fragment, 
                    UriFormat.UriEscaped) + relativeStr;
                return null;
            }

            // Check for a simple query in relative part
            if (relativeStr[0] == '?' && !baseUri.IsImplicitFile && baseUri.Syntax.InFact(UriSyntaxFlags.MayHaveQuery))
            {
                newUriString = baseUri.GetParts(UriComponents.AbsoluteUri & ~UriComponents.Query & ~UriComponents.Fragment, 
                    UriFormat.UriEscaped) + relativeStr;
                return null;
            }

            // Check on the DOS path in the relative Uri (a special case)
            if (relativeStr.Length >= 3
                && (relativeStr[1] == ':' || relativeStr[1] == '|')
                && IsAsciiLetter(relativeStr[0])
                && (relativeStr[2] == '\\' || relativeStr[2] == '/'))
            {

                if (baseUri.IsImplicitFile)
                {
                    // It could have file:/// prepended to the result but we want to keep it as *Implicit* File Uri
                    newUriString = relativeStr;
                    return null;
                }
                else if (baseUri.Syntax.InFact(UriSyntaxFlags.AllowDOSPath))
                {
                    // The scheme is not changed just the path gets replaced
                    string prefix;
                    if (baseUri.InFact(Flags.AuthorityFound))
                        prefix = baseUri.Syntax.InFact(UriSyntaxFlags.PathIsRooted) ? ":///" : "://";
                    else
                        prefix = baseUri.Syntax.InFact(UriSyntaxFlags.PathIsRooted) ? ":/" : ":";

                    newUriString = baseUri.Scheme + prefix + relativeStr;
                    return null;
                }
                // If we are here then input like "http://host/path/" + "C:\x" will produce the result  http://host/path/c:/x
            }


            ParsingError err = GetCombinedString(baseUri, relativeStr, userEscaped, ref newUriString);

            if (err != ParsingError.None)
            {
                e = GetException(err);
                return null;
            }

            if ((object)newUriString == (object)baseUri.m_String)
                return baseUri;

            return null;
        }
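
Judging by its signature and comments, ResolveHelper() handles the combination of an absolute base Uri with a relative one. It is internal, so the sketch below goes through the public Uri(Uri, string) constructor instead, on the assumption that this is the caller-facing route to the same behaviour; ResolveDemo and Resolve are hypothetical names. It exercises the "simple fragment" and "simple query" cases from the listing.

```csharp
using System;

public static class ResolveDemo
{
    // Combines an absolute base URI with a relative reference and
    // returns the resulting absolute URI as a string.
    public static string Resolve(string baseUri, string relative)
    {
        Uri combined = new Uri(new Uri(baseUri, UriKind.Absolute), relative);
        return combined.AbsoluteUri;
    }

    public static void Main()
    {
        string baseUri = "http://example.com/look/over/there?name=ferret#nose";

        // A relative part starting with '#' replaces only the fragment.
        Console.WriteLine(Resolve(baseUri, "#tail"));
        // prints http://example.com/look/over/there?name=ferret#tail

        // A relative part starting with '?' replaces the query and drops the fragment.
        Console.WriteLine(Resolve(baseUri, "?name=stoat"));
        // prints http://example.com/look/over/there?name=stoat
    }
}
```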
Jonathan Applebaum answered Nov 14 '22 23:11