I'm using the Django URLValidator in the following way in a form:
    def clean_url(self):
        validate = URLValidator(verify_exists=True)
        url = self.cleaned_data.get('url')
        try:
            logger.info(url)
            validate(url)
        except ValidationError as e:
            logger.info(e)
            raise forms.ValidationError("That website does not exist. Please try again.")
        return url
It seems to work with some URLs, but it fails for some valid ones. For example, http://www.amazon.com/ fails validation (which is obviously incorrect), while http://www.cisco.com/ passes. Is there any reason for these bogus errors?
Look at the source for URLValidator: if you specify verify_exists, it makes a HEAD request to the URL to check whether it's valid:
    req = urllib2.Request(url, None, headers)
    req.get_method = lambda: 'HEAD'
    ...
    opener.open(req, timeout=10)
Try making the HEAD request to Amazon yourself, and you'll see the problem:
    carl@chaffinch:~$ HEAD http://www.amazon.com
    405 MethodNotAllowed
    Date: Mon, 13 Aug 2012 18:50:56 GMT
    Server: Server
    Vary: Accept-Encoding,User-Agent
    Allow: POST, GET
    ...
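If the HEAD command-line tool isn't available, the same check can be reproduced from Python. This is only a rough sketch: it uses Python 3's urllib.request rather than the urllib2 module shown in the Django snippet, and head_status is a made-up helper name, not part of Django.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10):
    """Send a HEAD request; return (status code, value of the Allow header or None).

    head_status is a hypothetical helper for illustration only.
    """
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Allow")
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the status code
        # and response headers are still available on the exception.
        return err.code, err.headers.get("Allow")
```

A server that rejects HEAD the way the output above shows would give a 405 status together with the methods it does allow.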
I can't see a way of solving this other than monkey-patching or otherwise extending URLValidator to use a GET or POST request. Before doing so, you should think carefully about whether to use verify_exists at all (without which this problem should go away). As core/validators.py itself says,
"The URLField verify_exists argument has intractable security and performance issues. Accordingly, it has been deprecated."
You'll find that the in-development version of Django has indeed disposed of this feature completely.
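If you decide you still want an existence check, one possible shape for the "use GET instead" idea is to try HEAD first and retry with GET only when the server refuses HEAD. The following is a minimal sketch under several assumptions: it is written for Python 3's urllib.request (not urllib2), the names url_exists and _status are invented for this example, and treating 405 and 501 as "HEAD not supported" is my own choice. It is not how Django implements anything, and the security and performance concerns quoted above apply to it just as much.

```python
import urllib.error
import urllib.request


def _status(url, method, timeout):
    """Return the HTTP status for one request, or None if it could not be made."""
    try:
        req = urllib.request.Request(url, method=method)
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions carrying the status code.
        return err.code
    except (OSError, ValueError):
        # OSError covers URLError, DNS failures and timeouts;
        # ValueError is raised for strings that are not usable URLs.
        return None


def url_exists(url, timeout=10):
    """Hypothetical helper: HEAD first, then GET if HEAD is rejected."""
    status = _status(url, "HEAD", timeout)
    if status in (405, 501):  # assumption: these mean "HEAD not supported"
        status = _status(url, "GET", timeout)
    return status is not None and status < 400
```

In a form's clean_url, a helper like this could be called after a plain URLValidator() (without verify_exists) has checked the syntax, so the two concerns stay separate.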