I'm building a site that has products, each of which belongs to one or more categories, which can be nested within parent categories. I'd like to have SEO-friendly URLs, which look like this:
My question is: Is it safe to depend on the presence of a trailing slash to differentiate between cases 2 and 3? Can I always assume the user wants a category index when a trailing slash is detected, vs a specific product's page with no trailing slash?
I'm not worried about implementing this URI scheme; I've already done as much with PHP and mod_rewrite. I'm simply wondering if anybody knows of any objections to this kind of URL routing. Are there any known issues with browsers stripping/adding trailing slashes in the address bar, or with search engines crawling such a site? Any SEO issues or other stumbling blocks that I'm likely to run into?
A trailing slash is a forward slash ("/") placed at the end of a URL, such as domain.com/ or domain.com/page/. By convention, a URL with a trailing slash denotes a directory and a URL without one denotes a file. However, this is a guideline and not a requirement.
Historically, a trailing slash marked a directory, and a URL without one meant a file. Today, however, trailing slashes are purely conventional, and Google does not care whether you use them, as long as you're consistent.
One thing you may run into is seeing one URL reported as canonical and another in the address bar, specifically for home pages. The trailing slash after a domain name is always there, even if it's not shown. It's always implied for canonicalization.
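To illustrate what "being consistent" could look like in practice, here is a minimal Python sketch (not the asker's PHP/mod_rewrite setup). The function name `canonical_redirect` and the `trailing_slash` policy flag are hypothetical; the idea is only to pick one form per site and redirect the other form to it, while always treating the site root as "/". It deliberately ignores file-like paths such as `/style.css`, which a real site would probably want to exclude.

```python
from typing import Optional


def canonical_redirect(path: str, trailing_slash: bool = True) -> Optional[str]:
    """Return the path to redirect to, or None if `path` is already canonical.

    `trailing_slash` is a hypothetical site-wide policy: True means every
    page URL ends in "/", False means none does (except the root).
    """
    stripped = path.rstrip("/")
    if not stripped:
        # The root is always "/", whatever the policy says.
        canonical = "/"
    elif trailing_slash:
        canonical = stripped + "/"
    else:
        canonical = stripped
    return None if canonical == path else canonical
```

A server would typically answer with a permanent redirect whenever this returns a path, so that visitors and crawlers only ever see one form of each URL. Note that a site that follows this policy cannot also use the slash to tell a category from a product.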
In addition to the other pitfalls you mentioned, users might edit the URL themselves (by typing the product or category) and add or remove the trailing "/".
To solve your problem, why not have a special sub-category "all" and instead of "mysite.com/category/product" have "mysite.com/category/all/product"?
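A rough Python sketch of how routing with such an "all" marker might work, so that the trailing slash no longer carries any meaning. The function `route`, its return shape, and the example slugs are assumptions made for illustration; they are not taken from the asker's code.

```python
from typing import List, Optional, Tuple


def route(path: str) -> Tuple[str, List[str], Optional[str]]:
    """Classify a path as a category index or a product page.

    Returns (kind, category_segments, product_slug). Empty segments are
    dropped, so "/a/b" and "/a/b/" are routed identically.
    """
    segments = [s for s in path.split("/") if s]
    if "all" in segments:
        i = segments.index("all")
        categories, rest = segments[:i], segments[i + 1:]
        if len(rest) == 1:
            # Exactly one segment after "all" is taken to be the product.
            return ("product", categories, rest[0])
        return ("not_found", categories, None)
    return ("category", segments, None)
```

For example, `route("/shoes/all/red-sneaker")` and `route("/shoes/all/red-sneaker/")` both yield a product, while `route("/shoes/running")` yields a category index with or without the slash. The obvious cost is that "all" becomes a reserved word: no category or product may use it as its own slug.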
To me, it seems very unnatural that http://product/
and http://product
would represent two entirely different resources. It is confusing, and it makes your URLs less hackable, since it is difficult to tell whether a trailing slash should be present.
Also, in RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, there is a note on Protocol-Based Normalization in section 6.2.4, which talks about this particular situation with regard to non-human visitors of your site, such as search engines and web spiders:
Substantial effort to reduce the incidence of false negatives is often cost-effective for web spiders. Therefore, they implement even more aggressive techniques in URI comparison. For example, if they observe that a URI such as
http://example.com/data
redirects to a URI differing only in the trailing slash
http://example.com/data/
they will likely regard the two as equivalent in the future. (...)
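To make the risk concrete, here is a small Python sketch of the kind of aggressive comparison the RFC describes. How real crawlers implement this is not known; the helper name `same_resource_key` is hypothetical, and unlike the RFC's example (where the spider first observes a redirect) this version simply assumes that URLs differing only in a trailing slash are the same resource.

```python
from urllib.parse import urlsplit


def same_resource_key(url: str):
    """Build a comparison key that ignores trailing slashes in the path.

    Assumption: this mimics an aggressive crawler, not any specific one.
    """
    parts = urlsplit(url)
    # An empty path and "/" both mean the site root.
    path = parts.path.rstrip("/") or "/"
    return (parts.scheme.lower(), parts.netloc.lower(), path, parts.query)
```

With this key, `http://example.com/data` and `http://example.com/data/` compare equal. A crawler behaving this way would collapse a category index and a product page that differ only in the slash into a single entry.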