
Can using commas in URLs sometimes break the URL?

Is anyone aware of any problems with using commas in SEO-friendly URLs? I'm working with some software that uses a lot of commas in its SEO-friendly URLs, but I am certain I have seen instances where some programs or platforms don't recognize the URL correctly and cut the link off after the first comma.

I just tested this with Thunderbird, Gmail, Hotmail, and on an SMF forum with no problems; however, I know I have seen the issue before.

So my question is: is there anything in particular that would cause some platforms to stop linking a URL at a comma, such as a certain character after the comma?

Asked by Brett on Feb 02 '13 at 16:02




1 Answer

Countless implementations will cut off the automatic linking at that point, and the same happens with many other characters. But the problem is not the use of these characters; it is a wrong or incomplete implementation.

See for example this very site, Stack Overflow. It will cut off the link at the * when manually entering/pasting this URL (see bug; in case it gets fixed, here’s a screenshot of it):

  • http://wayback.archive.org/web/*/http://www.example.com/

But when using the hyperlink syntax, it works fine:

  • http://wayback.archive.org/web/*/http://www.example.com/

The * character is allowed in an HTTP URL path, so the link detection should have recognized the whole first URL instead of breaking it at the occurrence of *.
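To illustrate how such truncation can come about, here is a minimal sketch in Python. Both regular expressions and the `find_link` helper are hypothetical, written only for this illustration; they are not the pattern any particular site or mail client uses. The "naive" pattern accepts only a small set of characters, while the broader one also accepts the other characters RFC 3986 allows in a URL (including `*` and `,`).

```python
import re

# Hypothetical "naive" auto-linker: stops at any character outside a small set.
NAIVE_URL = re.compile(r"https?://[A-Za-z0-9\-._~/:?#&=%]+")

# Hypothetical broader pattern: also accepts the remaining RFC 3986
# reserved characters, including "*" and ",".
BROAD_URL = re.compile(r"https?://[A-Za-z0-9\-._~/:?#\[\]@!$&'()*+,;=%]+")


def find_link(pattern, text, strip_trailing=""):
    """Return the first URL-like match in text, or None if there is none.

    Characters listed in strip_trailing are removed from the end of the
    match, so sentence punctuation right after a URL is not swallowed.
    """
    match = pattern.search(text)
    if match is None:
        return None
    return match.group(0).rstrip(strip_trailing)


comma_text = "See http://www.example.com/page,1,2.html for details"
star_text = "Try http://wayback.archive.org/web/*/http://www.example.com/ today"

print(find_link(NAIVE_URL, comma_text))  # link is cut before the first comma
print(find_link(BROAD_URL, comma_text))  # link keeps the commas
print(find_link(NAIVE_URL, star_text))   # link is cut before the "*"
print(find_link(BROAD_URL, star_text))   # link keeps the "*"
```

The broader pattern has a cost: in running text such as "see http://www.example.com/a,b, then ..." it would also swallow the comma that ends the clause, which is why the sketch offers `strip_trailing` (for example `".,;:!?"`). That ambiguity is one plausible reason some linkers simply stop at the first comma.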


Regarding the comma:

The comma is a reserved character, and its meaning is relevant for the URL path, as RFC 3986 (Section 3.3) explains:

Aside from dot-segments in hierarchical paths, a path segment is considered opaque by the generic syntax. URI producing applications often use the reserved characters allowed in a segment to delimit scheme-specific or dereference-handler-specific subcomponents. For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same.
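As a rough sketch of what "delimiting subcomponents" can look like in practice, the following Python splits a path segment on literal commas and only afterwards percent-decodes each part. The function name `parse_comma_segment` is made up for this example, and the scheme (comma-separated values in one segment) is an assumption, not something the RFC prescribes.

```python
from urllib.parse import unquote


def parse_comma_segment(segment):
    """Split one path segment into subcomponents at literal commas.

    Splitting happens before percent-decoding, so an encoded comma (%2C)
    stays inside its subcomponent as plain data.
    """
    return [unquote(part) for part in segment.split(",")]


# A literal comma acts as a delimiter between "name" and the version.
print(parse_comma_segment("name,1.1"))    # ['name', '1.1']

# A percent-encoded comma is data, so the segment is not split.
print(parse_comma_segment("name%2C1.1"))  # ['name,1.1']
```

Under this assumed scheme, the literal and the encoded comma mean different things to the application, which is the practical consequence of the comma being reserved.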

So, if you don’t intend to use the comma for the function it has as a reserved character, you may want to percent-encode it as %2C. Users copying such a URL from their browser’s address bar would paste it in the encoded form, so it should work almost everywhere.

However, especially because it’s a reserved character, the unencoded form should work, too.
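For reference, this is one way to produce the encoded form, sketched with Python's standard `urllib.parse` module. The segment value `red,green,blue` is just an illustrative assumption.

```python
from urllib.parse import quote, unquote

# Hypothetical path segment in which the commas are plain data.
segment = "red,green,blue"

# By default quote() leaves only letters, digits, "_.-~" and "/" unescaped,
# so each comma becomes %2C.
encoded = quote(segment)
print(encoded)  # red%2Cgreen%2Cblue

# Listing the comma in `safe` keeps it literal, for cases where it is
# meant to act as a delimiter.
literal = quote(segment, safe=",")
print(literal)  # red,green,blue

# Decoding the encoded form gives back the original text.
print(unquote(encoded) == segment)  # True
```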

Answered by unor on Sep 23 '22 at 14:09