I'm not a regex expert and I'm racking my brain trying to write one that seems very simple and works in Python 2.7: validate the path of a URL (no hostname) without the query string. In other words, a string that starts with /, allows alphanumeric values, and doesn't allow any other special chars except these: /, ., -
I found this post that is very similar to what I need, but it isn't working for me at all: I can test it with, for example, aaa and it will return true even though the string doesn't start with /.
The current regex that I have sort of working is this one:
[^/+a-zA-Z0-9.-]
but it fails to reject paths that don't start with /. For example:
/aaa -> true, this is ok
/aaa/bbb -> true, this is ok
/aaa?q=x -> false, this is ok
aaa -> true, this is NOT ok
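The regex you've defined is a negated character class: it matches any single character that is not in the set, so it cannot require that the string starts with /. Instead, anchor the pattern at both ends and try:
^\/[/.a-zA-Z0-9-]+$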
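To see the difference on the examples from the question, here is a minimal sketch using Python's re module. The helper names is_valid_path and old_check are made up for illustration, and old_check is only a guess at how the original pattern was being used (treating a string as valid when no disallowed character is found).

```python
import re

# Anchored pattern from the answer: a leading "/", then one or more of
# "/", ".", "-", ASCII letters, or digits. The backslash before the first
# "/" is optional in Python, so it is omitted here.
PATH_RE = re.compile(r"^/[/.a-zA-Z0-9-]+$")

# Pattern from the question: matches any ONE character outside the set.
OLD_RE = re.compile(r"[^/+a-zA-Z0-9.-]")


def is_valid_path(path):
    """Hypothetical helper: True if the whole string matches PATH_RE."""
    return PATH_RE.match(path) is not None


def old_check(path):
    """Assumed usage of the original pattern: valid if nothing disallowed is found."""
    return OLD_RE.search(path) is None


for candidate in ["/aaa", "/aaa/bbb", "/aaa?q=x", "aaa"]:
    print(candidate, old_check(candidate), is_valid_path(candidate))
```

With the anchored pattern, aaa is rejected because the first character must be /. Two details worth knowing: the + requires at least one character after the leading slash, so a bare / is rejected, and $ also matches just before a trailing newline, so "/aaa\n" is accepted. If either matters, switching + to * or $ to \Z would be the usual adjustments.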
In other words, a string that starts with /, allows alphanumeric values and doesn't allow any other special chars except these: /, ., -
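You are missing some characters that are valid in URLs, for example ~. Instead of a regex, you can parse each URL, unquote its path, and check every character against a whitelist: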
import string
import urllib
import urlparse

# Python 2.7 modules; in Python 3 these functions live in urllib.parse.
# Allowed path characters: ASCII letters, digits, and / . - ~
valid_chars = string.ascii_letters + string.digits + '/.-~'
valid_paths = []
urls = ['http://www.my.uni.edu/info/matriculation/enroling.html',
        'http://info.my.org/AboutUs/Phonebook',
        'http://www.library.my.town.va.us/Catalogue/76523471236%2Fwen44--4.98',
        'http://www.my.org/462F4F2D4241522A314159265358979323846',
        'http://www.myu.edu/org/admin/people#andy',
        'http://www.w3.org/RDB/EMP?*%20where%20name%%3Ddobbins']
for url in urls:
    # Take only the path component and decode %XX escapes before checking.
    path = urllib.unquote(urlparse.urlparse(url).path)
    # Keep the path if it starts with "/" and every character is allowed.
    if path.startswith('/') and all(c in valid_chars for c in path):
        valid_paths.append(path)
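For anyone reading this on Python 3, the same idea can be sketched as follows. This is an assumed port, not part of the original answer: urlparse and unquote are imported from urllib.parse, and the function name valid_paths_from is invented for the example.

```python
import string
from urllib.parse import unquote, urlparse

# Same whitelist as above: ASCII letters, digits, and / . - ~
VALID_CHARS = set(string.ascii_letters + string.digits + "/.-~")


def valid_paths_from(urls):
    """Hypothetical helper: return the decoded paths that pass the whitelist."""
    result = []
    for url in urls:
        # urlparse drops the query string and fragment from .path;
        # unquote then decodes %XX escapes such as %2F.
        path = unquote(urlparse(url).path)
        if path.startswith("/") and all(ch in VALID_CHARS for ch in path):
            result.append(path)
    return result


urls = [
    "http://www.my.uni.edu/info/matriculation/enroling.html",
    "http://info.my.org/AboutUs/Phonebook",
    "http://www.library.my.town.va.us/Catalogue/76523471236%2Fwen44--4.98",
    "http://www.my.org/462F4F2D4241522A314159265358979323846",
    "http://www.myu.edu/org/admin/people#andy",
    "http://www.w3.org/RDB/EMP?*%20where%20name%%3Ddobbins",
]
paths = valid_paths_from(urls)
for p in paths:
    print(p)
```

Note that because the path is unquoted before the check, an escaped slash (%2F) becomes a real / and passes, while an escaped space (%20) inside the path becomes a space and is rejected. Whether decoding first is the right order depends on what the validated path is used for.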