Validating URLs in Python

I've been trying to figure out the best way to validate a URL (specifically in Python) but haven't been able to find an answer. It seems there isn't one accepted way to validate a URL; it depends on which URLs you expect to validate. I also found it hard to find an easy-to-read standard for URL structure. I did find RFCs 3986 and 3987, but they cover much more than just the structure.

Am I missing something, or is there no one standard way to validate a URL?

asked Mar 06 '14 23:03

mp94

4 Answers

The original question is a bit old, but you might also want to look at the Validator-Collection library I released a few months back. It includes high-performing regex-based validation of URLs for compliance against the RFC standard. Some details:

Tested against Python 2.7, 3.4, 3.5, 3.6, 3.7, and 3.8
No dependencies on Python 3.x, one conditional dependency in Python 2.x (drop-in replacement for Python 2.x's buggy re module)
Unit tests that cover 100+ different succeeding/failing URL patterns, including non-standard characters and the like. As close to covering the whole spectrum of the RFC standard as I've been able to find.

It's also very easy to use:

from validator_collection import validators, checkers

checkers.is_url('http://www.stackoverflow.com')
# Returns True

checkers.is_url('not a valid url')
# Returns False

value = validators.url('http://www.stackoverflow.com')
# value set to 'http://www.stackoverflow.com'

value = validators.url('not a valid url')
# raises a validator_collection.errors.InvalidURLError (which is a ValueError)

value = validators.url('https://123.12.34.56:1234')
# value set to 'https://123.12.34.56:1234'

value = validators.url('http://10.0.0.1')
# raises a validator_collection.errors.InvalidURLError (which is a ValueError)

value = validators.url('http://10.0.0.1', allow_special_ips = True)
# value set to 'http://10.0.0.1'

In addition, Validator-Collection includes 60+ other validators, including ones for IP addresses (IPv4 and IPv6), domains, and email addresses, which folks might find useful.

answered Oct 17 '22 11:10

Chris Modzelewski

This looks like it might be a duplicate of "How do you validate a URL with a regular expression in Python?"

You should be able to use the urlparse function (in urllib.parse on Python 3) described there.

>>> from urllib.parse import urlparse # python2: from urlparse import urlparse
>>> urlparse('actually not a url')
ParseResult(scheme='', netloc='', path='actually not a url', params='', query='', fragment='')
>>> urlparse('http://google.com')
ParseResult(scheme='http', netloc='google.com', path='', params='', query='', fragment='')

Call urlparse on the string you want to check, then make sure the resulting ParseResult has non-empty scheme and netloc attributes.
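The check described above could be wrapped in a small helper. This is only a minimal sketch: the function name looks_like_url and the allowed_schemes parameter are made up for illustration, and restricting the scheme to http/https is an assumption, not something the question asks for.

```python
from urllib.parse import urlparse


def looks_like_url(candidate, allowed_schemes=("http", "https")):
    """Return True if candidate parses with an allowed scheme and a non-empty netloc.

    Hypothetical helper: the name and the allowed_schemes default are assumptions.
    """
    try:
        parts = urlparse(candidate)
    except ValueError:
        # urlparse raises ValueError for some malformed input,
        # e.g. an unterminated IPv6 literal such as 'http://[::1'
        return False
    # urlparse lowercases the scheme, so 'HTTP://...' is accepted as well
    return parts.scheme in allowed_schemes and bool(parts.netloc)


print(looks_like_url("http://google.com"))   # True
print(looks_like_url("actually not a url"))  # False
```

Keep in mind this is a structural check only. urlparse is lenient, so a string such as 'http://exa mple.com' still passes; it is not full RFC 3986 validation.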

answered Oct 17 '22 12:10

bgschiller

I would use the validators package. Here is the link to the documentation and installation instructions.

It is as simple as:

import validators
url = 'YOUR URL'
validators.url(url)

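It returns True if the URL is valid. Otherwise it returns a failure object that evaluates as False (not the literal False), so test the result for truthiness rather than comparing it with "is False".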

answered Oct 17 '22 11:10

Tony Hammack

You can also use urllib.request: pass the URL to urlopen and catch URLError. Note that this makes a real network request, so it tests whether the URL can be reached rather than whether it is well-formed.

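from urllib.error import URLError
from urllib.request import urlopen

def validate_web_url(url="http://google"):
    """Return True if the URL can be opened, False otherwise (makes a network request)."""
    try:
        # Close the response as soon as we know the request succeeded
        with urlopen(url):
            return True
    except (URLError, ValueError):
        # URLError: unreachable host or HTTP error; ValueError: malformed URL (e.g. no scheme)
        return False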

Called with the default argument, this would normally return False, because the host name "google" usually cannot be resolved.
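Because opening the URL mixes two questions (is the string well-formed, and is the server reachable), it may help to report them separately. The following is a rough sketch under stated assumptions: the function name check_url, the three string labels, and the 5-second default timeout are all invented for illustration.

```python
from http.client import HTTPException
from urllib.parse import urlparse
from urllib.request import urlopen


def check_url(url, timeout=5):
    """Return 'malformed', 'unreachable', or 'ok' (hypothetical labels)."""
    # Step 1: structural check, no network traffic involved
    try:
        parts = urlparse(url)
    except ValueError:
        return "malformed"
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "malformed"

    # Step 2: reachability check, performs a real request
    try:
        with urlopen(url, timeout=timeout):
            return "ok"
    except (OSError, ValueError, HTTPException):
        # OSError covers URLError, HTTPError, and socket timeouts;
        # HTTPException covers http.client.InvalidURL
        return "unreachable"
```

With this split, a typo such as a missing scheme is reported without touching the network, while a well-formed URL that points at a dead host is reported as unreachable.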

answered Oct 17 '22 12:10

Hamza

Tags: python, url, url-validation