There are a number of fields a user can fill in where they'd enter a URL (their personal website, business site, favorite sites, etc etc).
It's the only thing they'd be entering in that particular field.
So should I always strip out "http://" to keep it consistent and to also reduce the possibility of broken links (ie. "http//")?
Just not sure what the best way to store URLs is.
You shouldn't have to do anything special to store a hyperlink within your database as they are simply strings. So you'll want to use a VARCHAR or TEXT field and you may want to consider making it fairly large (ie VARCHAR(512) or VARCHAR(MAX)) as URLS "can" be quite large although you may not run into any that big.
What data type is a URL? You should use a VARCHAR with an ASCII character encoding. URLs are percent encoded and international domain names use punycode so ASCII is enough to store them.
If there's a reason to sanitize your users' input (security, size, speed, accuracy...) then do it.
But otherwise, don't.
There's actually a benefit a lot of times in taking your customer-input data as-is. They own their own typos or misspellings, broken links, etc. that way. As long as it doesn't cause a problem for you (i.e. you don't have a reason to sanitize it).
BTW -- consistency is a moot point, as it won't change the data type, and you can easily check for the "http://" and add or remove it as necessary in your presentation layers with a re-usable function.
As far as I know you actually can not call it an "URL", without having the protocol part:
http://www.w3.org/Addressing/URL/url-spec.txt
I wouldn't remove it.
However if you really need to keep the data consistent, it really depends how the URL is actually typed in your application. If it's a browser-like application, I'd bet it can be assumed to be http:// in front if there is none, for valid links.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With