Python script to see if a web page exists without downloading the whole page?

I'm trying to write a script that tests whether a web page exists. Ideally it would check without downloading the whole page.

This is my jumping-off point. I've seen multiple examples use httplib in the same way; however, every site I check simply returns False.

import httplib
from httplib import HTTP
from urlparse import urlparse

def checkUrl(url):
    p = urlparse(url)
    h = HTTP(p[1])
    h.putrequest('HEAD', p[2])
    h.endheaders()
    return h.getreply()[0] == httplib.OK

if __name__=="__main__":
    print checkUrl("http://www.stackoverflow.com") # True
    print checkUrl("http://stackoverflow.com/notarealpage.html") # False

Any ideas?

Edit

Someone suggested this, but their post was deleted. Does urllib2 avoid downloading the whole page?

import urllib2

def checkUrl(some_url):
    # wrapped in a function so the return statements are valid
    try:
        urllib2.urlopen(some_url)
        return True
    except urllib2.URLError:
        return False
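
On the urllib2 question: a plain urlopen() call issues a GET, so the server still starts sending the body even if you never read it. Below is a minimal sketch of a HEAD-based check using only the standard library. It assumes Python 3, where urllib2 became urllib.request; the name head_ok and the timeout default are my own choices for illustration, not something from the original posts.

```python
import urllib.error
import urllib.request


def head_ok(url, timeout=5.0):
    """Return True if a HEAD request to `url` succeeds (status below 400)."""
    # method="HEAD" asks the server for the headers only.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with a 4xx or 5xx status.
        return False
    except (urllib.error.URLError, OSError):
        # No usable answer: DNS failure, refused connection, timeout.
        return False
```

Redirect handling here is left to urllib's defaults, so treat this as a starting point rather than a definitive check.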


asked Jun 24 '11 17:06

2 Answers

How about this:

import httplib
from urlparse import urlparse

def checkUrl(url):
    p = urlparse(url)
    conn = httplib.HTTPConnection(p.netloc)
    # HEAD asks for the headers only, so no body is transferred
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    # treat 2xx and 3xx (including redirects) as "exists"
    return resp.status < 400

if __name__ == '__main__':
    print checkUrl('http://www.stackoverflow.com') # True
    print checkUrl('http://stackoverflow.com/notarealpage.html') # False

This sends an HTTP HEAD request and returns True if the response status code is below 400.

Note that Stack Overflow's root path returns a redirect (301), not a 200 OK, which is why comparing the status against httplib.OK returned False in the question.
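
For anyone on Python 3, here is a rough port of the same idea, where httplib is now http.client and urlparse lives in urllib.parse. The split into head_status and check_url, the HTTPS branch, and the timeout default are my own additions for illustration. Returning the raw status makes it easy to tell a 301 from a 200.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=5.0):
    """Send one HEAD request and return the status code, or None on failure."""
    parts = urlsplit(url)
    # Choose the connection class from the URL scheme.
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    # Request "/" when the URL has no path, and keep any query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a malformed reply.
        return None
    finally:
        conn.close()


def check_url(url):
    """True for any status below 400, so redirects such as 301 count as existing."""
    status = head_status(url)
    return status is not None and status < 400
```

http.client does not follow redirects on its own, so a 301 is reported as 301 here. Whether a redirect should count as "the page exists" is a judgment call; this sketch keeps the answer's `< 400` rule.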


answered Oct 18 '22 19:10

Corey Goldberg

Using requests, this is as simple as:

import requests

ret = requests.head('http://www.example.com')
print(ret.status_code)

This only fetches the response headers. To test whether the request succeeded, check the response's status_code, or call the raise_for_status method, which raises an exception if the server answered with an error status (4xx or 5xx).

answered Oct 18 '22 19:10

MaxNoe



Tags: python, httplib, urlparse

some1
