Are there any other characters except A-Za-z0-9 that can be used to shorten links without getting into trouble? :)
I was thinking about +,;- or something.
Is there a defined standard regarding what characters can be used in a URL that browser vendors respect?
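Unsafe characters, as the older RFC 1738 classifies them, include "{", "}", "|", "\", "^", "~", "[", "]", and "`". All unsafe characters must always be encoded within a URL. (The newer RFC 3986 treats "~" as unreserved, so it no longer needs encoding.)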
Reserved characters, which need encoding whenever they are used as data rather than as delimiters, are: ':' , '/' , '?' , '#' , '[' , ']' , '@' , '!' , '$' , '&' , "'" , '(' , ')' , '*' , '+' , ',' , ';' , '='. The '%' character itself must also be encoded.
Since URLs often contain characters outside the ASCII set, the URL has to be converted into a valid ASCII format. URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits. URLs cannot contain spaces: URL encoding replaces a space with %20, or with a plus (+) sign in form-encoded query strings.
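As a rough illustration of the percent-encoding described above, here is a small sketch using Python's standard urllib.parse module. The sample strings and variable names are made up for the example; only quote, quote_plus and unquote come from the library.

```python
from urllib.parse import quote, quote_plus, unquote

# A space becomes %20 under plain percent-encoding...
space_as_percent = quote("a b")

# ...and "+" under the form-encoding variant used for query strings.
space_as_plus = quote_plus("a b")

# Non-ASCII text is first encoded as UTF-8 bytes, then each byte is
# written as "%" followed by two hexadecimal digits.
non_ascii = quote("é")

# Decoding reverses the process.
decoded = unquote("%7E")

print(space_as_percent)  # a%20b
print(space_as_plus)     # a+b
print(non_ascii)         # %C3%A9
print(decoded)           # ~
```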
A path segment (the parts in a path separated by /) in an absolute URI path can contain zero or more pchar, which is defined as follows:
    pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
    pct-encoded = "%" HEXDIG HEXDIG
    unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
So it’s basically A–Z, a–z, 0–9, -, ., _, ~, !, $, &, ', (, ), *, +, ,, ;, =, :, @, as well as % that must be followed by two hexadecimal digits. Any other character/byte needs to be encoded using the percent-encoding.
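To make the pchar rule concrete, here is a minimal sketch that checks whether a string is a valid path segment and percent-encodes everything else. The names PCHAR_LITERALS, is_valid_segment and encode_segment are hypothetical helpers invented for this example, not part of any standard; the sketch also assumes Python 3.7 or later, where urllib.parse.quote leaves "~" unencoded.

```python
import re
import string
from urllib.parse import quote

# Literal characters allowed by the pchar rule (hypothetical constant names).
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
PCHAR_LITERALS = UNRESERVED + SUB_DELIMS + ":@"

# Zero or more pchar: either an allowed literal or "%" plus two hex digits.
SEGMENT_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})*")


def is_valid_segment(segment: str) -> bool:
    """Return True if the whole string matches the pchar grammar."""
    return SEGMENT_RE.fullmatch(segment) is not None


def encode_segment(text: str) -> str:
    """Percent-encode everything that is not a literal pchar.

    quote() never encodes letters, digits and "-._~"; the `safe`
    argument adds the sub-delims plus ":" and "@". Passing `safe`
    replaces the default "/", so "/" and "%" are both encoded.
    """
    return quote(text, safe=SUB_DELIMS + ":@")


print(len(PCHAR_LITERALS))           # 79
print(is_valid_segment("a;v=1"))     # True
print(is_valid_segment("a b"))       # False
print(encode_segment("a b/c"))       # a%20b%2Fc
```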
Although these are 79 characters in total that can be used in a path segment literally, some user agents do encode some of these characters as well (e.g. %7E instead of ~). That’s why many use just the 62 alphanumeric characters (i.e. A–Z, a–z, 0–9) or the Base 64 Encoding with URL and Filename Safe Alphabet (i.e. A–Z, a–z, 0–9, -, _).
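For a link shortener, both alphabets can be applied to a numeric ID roughly as follows. This is only a sketch under assumptions: the function names are hypothetical, the ordering of the 62-character alphabet (digits, then uppercase, then lowercase) is an arbitrary choice, and stripping the "=" padding from the Base64 output is a convention chosen here to keep the result within the URL-safe alphabet.

```python
import base64
import string

# Assumed ordering: 0-9, A-Z, a-z. Any fixed ordering works as long as
# encoding and decoding agree on it.
BASE62_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62(n: int) -> str:
    """Encode a non-negative integer using the 62 alphanumeric characters."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(BASE62_ALPHABET[rem])
    return "".join(reversed(digits))


def from_base62(s: str) -> int:
    """Decode a string produced by to_base62 back into an integer."""
    n = 0
    for ch in s:
        n = n * 62 + BASE62_ALPHABET.index(ch)
    return n


def to_base64url(n: int) -> str:
    """Encode a non-negative integer with the URL and filename safe
    Base64 alphabet (A-Z, a-z, 0-9, "-", "_"), without "=" padding."""
    raw = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


print(to_base62(62))       # 10
print(from_base62("10"))   # 62
print(to_base64url(255))   # _w
```

The Base62 form avoids "-" and "_" entirely, while the Base64 form is slightly shorter for large IDs; which trade-off matters depends on the service.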