
Is it always safe to remove a trailing slash from a URL?

Tags: url, database

I'm storing URLs in a database, and I want to be able to know if two URLs are identical.

Generally, a trailing slash at the end doesn't change the response you'd get from a server (e.g. http://www.google.com/ is the same as http://www.google.com).

Can I always blindly remove the trailing slash from any URL, without looking at anything?
Is that safe?

What I mean by "without looking at anything" is that I'd remove the slash from:
http://www.google.com/q?xxx=something&yyy=something/

I know the web server could theoretically return completely different things if it wanted, and I know sometimes going to a URL without the slash will redirect to one with the slash. My only intention here is determining if both URLs are the same.

Is this method safe?

Daniel Magliola asked Jun 03 '09 02:06




2 Answers

No, it is not always safe. A web server can interpret the path part of the URL any way it likes. You cannot tell what it will do (how it resolves the URI) without issuing a GET or HEAD request on the URL.
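As a rough sketch of that kind of check, the Python below issues a HEAD request for each URL and compares the final URL (after redirects) and the status code. The function names `head_summary` and `same_resource` are made up for illustration, and matching results are only a hint that the server treats the two URLs alike, not proof.

```python
from urllib.request import Request, urlopen


def head_summary(url, opener=urlopen, timeout=10):
    """Send a HEAD request and return (final_url, status).

    urlopen follows redirects, so final_url is where the server
    ended up sending us. 4xx/5xx responses raise HTTPError.
    """
    req = Request(url, method="HEAD")
    with opener(req, timeout=timeout) as resp:
        return resp.geturl(), resp.status


def same_resource(url_a, url_b, opener=urlopen):
    """Heuristic: both URLs land on the same final URL with the same status."""
    return head_summary(url_a, opener) == head_summary(url_b, opener)
```

The `opener` parameter is only there so the check can be exercised without a network connection; in normal use you would call `same_resource(url_a, url_b)` and accept that it costs two requests per comparison.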

dajobe answered Oct 14 '22 04:10


It may be safe in the sense that you'll get the same response with or without a trailing slash (and I can't guarantee that's true), but the two URLs can definitely mean different things. Consider a URL that references a directory, or something presented by the site as a directory. Using the URL

http://www.somesite.com/directory/

...makes it clear you're asking for a directory. If you hack off the trailing slash:

http://www.somesite.com/directory

...the site will first take this as a request for a file called "directory". Finding none, it will likely fall back to treating it as a request for the directory, often by redirecting to the URL with the trailing slash. Even so, the meanings are not the same, and you might not get what you expect.

See this article for more detail.
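For comparing stored URLs without contacting the server, one conservative option is to normalize only the case that is known to be equivalent: an empty path and a path of "/" (RFC 3986 treats these as the same for http URLs). The sketch below does just that and leaves every other trailing slash alone; `comparison_key` is a hypothetical helper name, not something from the question.

```python
from urllib.parse import urlsplit, urlunsplit


def comparison_key(url):
    """Return a key for equality checks between stored URLs.

    Only an empty path is rewritten (to "/"). Trailing slashes on
    deeper paths or inside the query string are kept, because the
    server may treat those URLs differently.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# The root URL with and without the slash compares equal...
print(comparison_key("http://www.google.com") == comparison_key("http://www.google.com/"))
# ...while /directory and /directory/ stay distinct.
print(comparison_key("http://www.somesite.com/directory")
      == comparison_key("http://www.somesite.com/directory/"))
```

Storing the key alongside the original URL keeps the original intact while still letting the database detect the one duplicate that is safe to assume.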

Michael Petrotta answered Oct 14 '22 04:10