how can i determine if anything at the given url does exist in the web using python? it can be a html page or a pdf file, shouldnt be matter. ive tried the solution written in this page http://code.activestate.com/recipes/101276/ but it just returns a 1 when its a pdf file or anything.
You need to check HTTP response code. Python example:
from urllib2 import urlopen
code = urlopen("http://example.com/").code
4xx and 5xx code probably mean that you cannot get anything from this URL. 4xx status codes describe client errors (like "404 Not found") and 5xx status codes describe server errors (like "500 Internal server error"):
if (code / 100 >= 4):
print "Nothing there."
Links:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With