I was wondering whether search engines respect the HTTP header field Content-Location
.
This could be useful, for example, when you want to remove the session ID argument out of the URL:
GET /foo/bar?sid=0123456789 HTTP/1.1
Host: example.com
…
HTTP/1.1 200 OK
Content-Location: http://example.com/foo/bar
…
Clarification:
I don’t want to redirect the request, as removing the session ID would lead to a completely different request and thus probably also a different response. I just want to state that the enclosed response is also available under its “main URL”.
Maybe my example was not a good representation of the intent of my question. So please take a look at What is the purpose of the HTTP header field “Content-Location”?.
I think Google just announced the answer to my question: the canonical
link relation for declaring the canonical URL.
Maile Ohye from Google wrote:
MickeyC said...
You should have used the Content-Location header instead, as per:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
"14.14 Content-Location"@MikeyC: Yes, from a theoretical standpoint that makes sense and we certainly considered it. A few points, however, led us to choose :
Our data showed that the "Content-Location" header is configured improperly on many web sites. Sometimes webmasters provide long, ugly URLs that aren’t even duplicates -- it's probably unintentional. They're likely unaware that their webserver is even sending the Content-Location header.
It would've been extremely time consuming to contact site owners to clean up the Content-Location issues throughout the web. We realized that if we started with a clean slate, we could provide the functionality more quickly. With Microsoft and Yahoo! on-board to support this format, webmasters need to only learn one syntax.
Often webmasters have difficulty configuring their web server headers, but can more easily change their HTML. rel="canonical" seemed like a friendly attribute.
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html?showComment=1234714860000#c8376597054104610625
Most decent crawlers do follow Content-Location. So, yes, search engines respect the Content-Location header, although that is no guarantee that the URL having the sid parameter will not be on the results page.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With