I'm writing a web application that dynamically creates URLs based on some input, to be consumed by a client at another time. For discussion's sake, these URLs can contain certain characters, like a forward slash ('/'), which should not be interpreted as part of the actual URL, but just as an argument. For example:
http://mycompany.com/PartOfUrl1/PartOfUrl2/ArgumentTo/Url/GoesHere
As you can see, the ArgumentTo/Url/GoesHere does indeed have forward slashes but these should be ignored or escaped.
This may be a bad example, but the question at hand is more general and applies to other special characters.
Given some of the answers I realized that I failed to point out a few pieces that hopefully will help clarify.
I would like to keep this fairly language agnostic, as it would be great if the client could just make a request. For example, if the client knew that it wanted to pass ArgumentTo/Url/GoesHere, it would be great if that could be encoded into a unique string that the server could then decode and use.
Can we assume that similar functions like HttpUtility.HtmlEncode/HtmlDecode in the .NET Framework are available on other systems/platforms? The URL does not have to be pretty by any means so having real words in the path does not really matter.
It seems that base64 encoding/decoding is fairly readily available on any platform/language.
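As a rough illustration of the base64 idea, here is a minimal Python sketch. The helper names encode_arg and decode_arg are made up for this example. It assumes the URL-safe base64 alphabet, because standard base64 can itself emit '/' and '+', which would bring the original problem back.

```python
import base64


def encode_arg(arg: str) -> str:
    """Encode an arbitrary string into a token usable as one URL path segment."""
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/'.
    token = base64.urlsafe_b64encode(arg.encode("utf-8")).decode("ascii")
    # Drop the '=' padding; it can be recomputed from the length when decoding.
    return token.rstrip("=")


def decode_arg(token: str) -> str:
    """Reverse encode_arg: restore the padding, then decode."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


token = encode_arg("ArgumentTo/Url/GoesHere")
url = "http://mycompany.com/PartOfUrl1/PartOfUrl2/" + token
original = decode_arg(token)
```

The token contains only letters, digits, '-' and '_', so it occupies exactly one path segment, at the cost of the URL no longer being human-readable.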
You didn't say which language you're using, but PHP has the useful urlencode function and C# has HttpUtility.UrlEncode and Server.UrlEncode, which should encode parts of your URL nicely.
In case you need another way, this page has a list of encoded values. E.g.: / == %2f.
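For comparison, the same percent-encoding can be sketched in Python with the standard library's urllib.parse. This is only an illustration, not something the answer itself uses. Note the safe="" argument: by default quote() leaves '/' untouched.

```python
from urllib.parse import quote, unquote

arg = "ArgumentTo/Url/GoesHere"

# safe="" forces "/" to be escaped as well; the default is safe="/".
encoded = quote(arg, safe="")

url = "http://mycompany.com/PartOfUrl1/PartOfUrl2/" + encoded

# The server side reverses the encoding to recover the argument.
decoded = unquote(encoded)
```

One caveat: some servers treat %2F in the path specially. Apache httpd, for instance, rejects such requests unless AllowEncodedSlashes is enabled, so this may need server-side configuration to work in a path segment.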
From what you've updated, I'd say use Voyagerfan's idea of URL rewriting to make something like:
http://www.example.com/([A-Za-z0-9/]+) http://www.example.com/?page=$1
And then use the application's GET parser to filter it out.
You could use Apache rewrites to rewrite http://mycompany.com/PartOfUrl1/PartOfUrl2 to http://mycompany.com/path/to/program.php and then pass in ArgumentTo/Url/GoesHere as a standard GET parameter. So what the server actually sends back is the response for http://mycompany.com/path/to/program.php?arg=ArgumentTo/Url/GoesHere.
Rewriting is a good way to guard against technology changes (so switching from PHP to ASP, for example, won't change your URLs) and provide friendly URLs to your users at the same time.
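To show what "a standard GET parameter" looks like on both ends, here is a small Python sketch using urllib.parse. The parameter name arg and the program.php path are taken from the example URL above; everything else is illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base = "http://mycompany.com/path/to/program.php"

# Client side: urlencode() percent-encodes the value, including "/".
url = base + "?" + urlencode({"arg": "ArgumentTo/Url/GoesHere"})

# Server side: split off the query string and decode it again.
query = urlsplit(url).query
arg = parse_qs(query)["arg"][0]
```

Slashes inside a query string do not affect routing, which is why moving the argument out of the path sidesteps the original problem.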
Using your example URLs and building on what I said before, I'd say to use this code in your httpd.conf or .htaccess:
RewriteEngine On
RewriteRule ^/?PartOfUrl1/PartOfUrl2/([A-Za-z0-9/]+)$ /path/to/program.php?arg=$1 [L]
(BTW, RewriteRule matches against the URL path only, not the scheme or host name, and that line needs to contain no line breaks.)
Changing the paths, the filenames, name of the arg, etc. is fine; the critical parts here are the regex capture group (([A-Za-z0-9/]+)), which must allow slashes and more than one character, and the $1 that inserts whatever the group matched.
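To check what such a rule would capture without touching the server, the matching can be approximated in Python. This is only a sketch: the rewrite function below is a hypothetical stand-in for mod_rewrite, and it assumes a capture group that allows slashes and repeats, ([A-Za-z0-9/]+).

```python
import re

# Assumed pattern: everything after /PartOfUrl1/PartOfUrl2/ becomes the argument.
PATTERN = re.compile(r"^/?PartOfUrl1/PartOfUrl2/([A-Za-z0-9/]+)$")


def rewrite(path):
    """Return the rewritten target for a URL path, or None if the rule does not apply."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # match.group(1) plays the role of $1 in the RewriteRule substitution.
    return "/path/to/program.php?arg=" + match.group(1)


target = rewrite("/PartOfUrl1/PartOfUrl2/ArgumentTo/Url/GoesHere")
```

Python's regex dialect is close to, but not identical to, the PCRE engine Apache uses, so treat this as a sanity check rather than a guarantee.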