
How to encode space in the fragment identifier in a URL

In the fragment identifier of a URL, should a space be encoded as %20, as in the path, or as +, as in the query string?

Desmond Hume asked Oct 02 '22 02:10


1 Answer

For HTML pages, a space in the fragment identifier should be percent-encoded as %20.
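As a minimal sketch of what this means in practice, the snippet below builds a URL whose fragment contains a space, using Python's standard `urllib.parse.quote`. The helper name `with_fragment` and the example.com URL are illustrative assumptions, not part of any spec.

```python
from urllib.parse import quote, quote_plus


def with_fragment(base_url: str, fragment: str) -> str:
    """Append a percent-encoded fragment to base_url (hypothetical helper)."""
    # quote() encodes a space as %20; safe="" also encodes "/" for simplicity.
    return f"{base_url}#{quote(fragment, safe='')}"


url = with_fragment("https://example.com/page.html", "my section")
print(url)  # https://example.com/page.html#my%20section

# For contrast, quote_plus() produces the form-style "+" used in query strings.
print(quote_plus("my section"))  # my+section
```

The `+` form produced by `quote_plus` is a convention of form-encoded query strings, so it is not the right tool for fragments.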

According to RFC 2396, RFC 3986, and RFC 7320, the format of fragment identifiers depends on the media type. From RFC 2396 and RFC 3986:

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced.

From RFC 7320:

Media type definitions (as per [RFC6838]) SHOULD specify the fragment identifier syntax(es) to be used with them; other specifications MUST NOT define structure within the fragment identifier, unless they are explicitly defining one for reuse by media type definitions.

The HTML5 spec specifies only percent-decoding when it processes the fragment, so a + is not converted to a space:

The indicated part of the document is the one that the fragment identifier, if any, identifies. The semantics of the fragment identifier in terms of mapping it to a specific DOM Node is defined by the specification that defines the MIME type used by the Document (for example, the processing of fragment identifiers for XML MIME types is the responsibility of RFC7303).

For HTML documents (and HTML MIME types), the following processing model must be followed to determine what the indicated part of the document is.

  1. Apply the URL parser algorithm to the URL, and let fragid be the fragment component of the resulting parsed URL.
  2. If fragid is the empty string, then the indicated part of the document is the top of the document; stop the algorithm here.
  3. Let fragid bytes be the result of percent-decoding fragid.
  4. Let decoded fragid be the result of applying the UTF-8 decoder algorithm to fragid bytes. If the UTF-8 decoder emits a decoder error, abort the decoder and instead jump to the step labeled no decoded fragid.
  5. [...]

(emphasis mine)
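The first four steps quoted above can be approximated in a few lines of Python. This is only a rough sketch: `urlsplit` stands in for the spec's URL parser, Python's strict UTF-8 codec stands in for the spec's UTF-8 decoder, and the function name `decoded_fragid` is made up for illustration. It returns `None` where the spec would jump to the "no decoded fragid" step.

```python
from typing import Optional
from urllib.parse import unquote_to_bytes, urlsplit


def decoded_fragid(url: str) -> Optional[str]:
    """Approximate steps 1-4 of the quoted processing model (illustrative only)."""
    # Step 1: take the fragment component (urlsplit approximates the URL parser).
    fragid = urlsplit(url).fragment
    # Step 2: an empty fragment indicates the top of the document.
    if fragid == "":
        return ""
    # Step 3: percent-decode to bytes; "+" is left untouched.
    fragid_bytes = unquote_to_bytes(fragid)
    # Step 4: UTF-8 decode; a decoder error means there is no decoded fragid.
    try:
        return fragid_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return None


print(decoded_fragid("https://example.com/#my%20section"))  # my section
print(decoded_fragid("https://example.com/#my+section"))    # my+section
```

Because only percent-decoding is applied, `#my+section` would be matched against an ID containing a literal plus sign, not a space.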

For XML documents, RFC 7303 specifies the syntax of the XPointer Framework, which also requires percent-encoding for reserved URI characters.

Other media types may have different rules.

nwellnhof answered Oct 07 '22 18:10