
Why is '%20' used as a space in URLs?

Tags: url, encoding

I am interested in knowing why '%20' is used as a space in URLs, particularly why %20 was used and why we even need it in the first place.

orange asked Dec 16 '12 at 11:12


People also ask

Why is space a %20 on URL?

In ASCII, the space character is assigned code 32, which is 20 in hexadecimal. A "%20" in an encoded URL therefore represents a space, for example http://www.example.com/products%20and%20services.html.

What does space in a URL mean?

RFC 1738 classifies the space as an "unsafe" character, and its successor, RFC 3986, does not include the space among the characters allowed in a URI. A space must therefore not appear literally in a URL and has to be encoded instead. Such characters are expressed as a percent sign followed by the two hexadecimal digits of the character's byte value.

Why can't URLs have spaces?

RFC 1738 lists the space among its "unsafe" characters: "Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs."


2 Answers

It's called percent-encoding. Some characters can't appear literally in a URI (for example #, because it marks the start of the URL fragment), so they are represented with characters that can (# becomes %23).

Here's an excerpt from the Wikipedia article on percent-encoding:

When a character from the reserved set (a "reserved character") has special meaning (a "reserved purpose") in a certain context, and a URI scheme says that it is necessary to use that character for some other purpose, then the character must be percent-encoded. Percent-encoding a reserved character involves converting the character to its corresponding byte value in ASCII and then representing that value as a pair of hexadecimal digits. The digits, preceded by a percent sign ("%") which is used as an escape character, are then used in the URI in place of the reserved character. (For a non-ASCII character, it is typically converted to its byte sequence in UTF-8, and then each byte value is represented as above.)

The space character's character code is 32:

> ' '.charCodeAt(0)
32

Which is 20 in base-16:

> ' '.charCodeAt(0).toString(16)
"20"

Tack a percent sign in front of it and you get %20.
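The same steps can be sketched as a small function. This is only an illustration: percentEncodeAll is a hypothetical helper, not a built-in, and it encodes every byte of the string's UTF-8 form, whereas a real encoder such as encodeURIComponent leaves letters, digits and a few other safe characters untouched.

```javascript
// Hypothetical helper (not a built-in): percent-encode every byte of the
// string's UTF-8 representation, following the steps described above.
function percentEncodeAll(text) {
  // Convert the string to its UTF-8 byte sequence.
  const bytes = new TextEncoder().encode(text);
  // Write each byte as '%' plus two uppercase hexadecimal digits.
  return Array.from(bytes, (byte) =>
    '%' + byte.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(percentEncodeAll(' ')); // %20
console.log(percentEncodeAll('#')); // %23
console.log(percentEncodeAll('é')); // %C3%A9 (two UTF-8 bytes)
```

The padStart call matters for byte values below 16: a newline (byte 10) has to come out as %0A rather than %A.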

Blender answered Oct 05 '22 at 05:10


Because URLs have strict syntactic rules: / is a special path separator character, spaces are not allowed, and every character has to come from a limited subset of ASCII. To embed arbitrary characters in a URL regardless of these restrictions, their bytes can be percent-encoded. The byte 0x20 represents a space in the ASCII encoding (and most other encodings), hence %20 is the URL-encoded version of it.
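In practice you rarely do this by hand. A minimal sketch using JavaScript's built-in encodeURIComponent and decodeURIComponent follows; the file name is made up for illustration, and it contains both a space and a / that should be treated as data rather than as a path separator.

```javascript
// Hypothetical file name containing a space and a slash, to be embedded
// as a single path segment.
const fileName = 'products and services/2012.html';

// encodeURIComponent percent-encodes the space and the reserved '/'.
const encoded = encodeURIComponent(fileName);
const url = 'http://www.example.com/' + encoded;

console.log(url);
// http://www.example.com/products%20and%20services%2F2012.html

// Decoding restores the original characters.
console.log(decodeURIComponent(encoded) === fileName); // true
```

Because the / inside the name is encoded as %2F, it is carried as data instead of being read as a path separator.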

deceze answered Oct 05 '22 at 06:10