I am dealing with a situation where I need users to enter various URLs (for example: for their profiles). However, users do not always insert URLs in the https://example.com
format. They might insert something like:
example.com
example.com/
example.com/somepage
[email protected]
or something else should not be acceptableHow can I normalize the URLs to a format that can potentially lead to a web address? I see this behavior in web browsers. We almost always enter crappy things in a web browser's bar and they can distinguish whether that's a search or something that can be turned into a URL.
I tried looking in many places but seems like I can't find any approach to this.
I would prefer a solution written for Node if it's possible. Thank you very much!
Use node's URL API, alongside some manual checks.
Example code:
const { URL } = require('url')
let myTestUrl = 'https://user:[email protected]:8080/p/a/t/h?query=string#hash';
try {
if (!myTestUrl.startsWith('https://') && !myTestUrl.startsWith('http://')) {
// The following line is based on the assumption that the URL will resolve using https.
// Ideally, after all checks pass, the URL should be pinged to verify the correct protocol.
// Better yet, it should need to be provided by the user - there are nice UX techniques to address this.
myTestUrl = `https://${myTestUrl}`
}
const normalizedUrl = new URL(myTestUrl);
if (normalizedUrl.username !== '' || normalized.password !== '') {
throw new Error('Username and password not allowed.')
}
// Do your thing
} catch (e) {
console.error('Invalid url provided', e)
}
I have only used http
and https
in this example, for a gist.
Straight from the docs, a nice visualisation of the API:
┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│ href │
├──────────┬──┬─────────────────────┬─────────────────────┬───────────────────────────┬───────┤
│ protocol │ │ auth │ host │ path │ hash │
│ │ │ ├──────────────┬──────┼──────────┬────────────────┤ │
│ │ │ │ hostname │ port │ pathname │ search │ │
│ │ │ │ │ │ ├─┬──────────────┤ │
│ │ │ │ │ │ │ │ query │ │
" https: // user : pass @ sub.host.com : 8080 /p/a/t/h ? query=string #hash "
│ │ │ │ │ hostname │ port │ │ │ │
│ │ │ │ ├──────────────┴──────┤ │ │ │
│ protocol │ │ username │ password │ host │ │ │ │
├──────────┴──┼──────────┴──────────┼─────────────────────┤ │ │ │
│ origin │ │ origin │ pathname │ search │ hash │
├─────────────┴─────────────────────┴─────────────────────┴──────────┴────────────────┴───────┤
│ href │
└─────────────────────────────────────────────────────────────────────────────────────────────┘
You want the normalize-url
package:
const normalizeUrl = require('normalize-url');
normalizeUrl('example.com/');
//=> 'http://example.com'
It runs a bunch of normalizations on the URL.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With