>>> a=urllib.urlopen('http://www.domain.com/bigvideo.avi')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200
My question: bigvideo.avi is 500 MB. Does my script download the whole file first and then check the status code? Or can it check the code immediately, without saving the file?
Just use the Chrome browser. Hit F12 to open the developer tools and look at the Network tab. It shows you all status codes, whether a page came from cache, etc.
The HTTP 404 Not Found response status code indicates that the server cannot find the requested resource.
HTTP Status Code 404 - Not Found: This means the file or page that the browser is requesting wasn't found by the server. A 404 doesn't indicate whether the page or resource is missing permanently or only temporarily. You can see what this looks like on your site by typing in a URL that doesn't exist.
The 200 OK status code means that the request was successful, but the meaning of success depends on the request method used. For GET, the requested resource has been fetched and transmitted in the message body.
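If you want to turn a numeric code into its standard reason phrase in a script, something like the following should work. This is a minimal sketch assuming Python 3 (the session above is Python 2); describe_status and is_success are hypothetical helper names, not part of any library.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code with its standard reason phrase, e.g. '404 Not Found'."""
    status = HTTPStatus(code)  # raises ValueError for codes the module does not know
    return "%d %s" % (status.value, status.phrase)


def is_success(code):
    """True for 2xx codes, the range HTTP reserves for successful responses."""
    return 200 <= code < 300
```

For example, describe_status(404) gives the phrase for the first request in the question, and is_success(200) is true for the second.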
You want to actually tell the server not to send the full content of the file. HTTP has a mechanism for this called "HEAD" that is an alternative to "GET". It works the same way, but the server only sends you the headers, none of the actual content.
That saves bandwidth on both ends, whereas simply not calling read() only means your script doesn't pull down the full file; the server still starts sending it.
Try this:
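import httplib

# Hostname only (no scheme); the path goes in the request itself.
c = httplib.HTTPConnection("www.example.com")
c.request("HEAD", "/foo")  # HEAD: the server returns headers only, no body
print c.getresponse().status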
This prints the status code. The URL passed to request() should be only the path, like "/foo", and the hostname should look like "www.example.com".
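For anyone on Python 3, where httplib was renamed to http.client, roughly the same thing might look like this. It is only a sketch: head_status is a hypothetical helper name, and the port and timeout defaults are my own assumptions. It also returns the Content-Length header, when the server sends one, so you can see the file size without downloading anything.

```python
import http.client


def head_status(host, path="/", port=80, timeout=10):
    """Send a HEAD request; return (status code, Content-Length header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)  # headers only, the server sends no body
        resp = conn.getresponse()
        return resp.status, resp.getheader("Content-Length")
    finally:
        conn.close()  # always release the socket, even if the request fails
```

Usage would be along the lines of head_status("www.example.com", "/foo"). For an https URL you would swap in http.client.HTTPSConnection and port 443.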
Yes, it will fetch the file: urlopen sends a GET request, so the server starts sending the body even if you only call getcode().
I think what you really want is to send an HTTP HEAD request, which asks the server for the headers only, not for the data itself.
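If you would rather keep passing a full URL the way the question does, a HEAD request can also be sent through urllib. A hedged sketch, assuming Python 3's urllib.request (the Python 2 urllib.urlopen in the question has no way to set the method); url_status is a hypothetical name.

```python
import urllib.error
import urllib.request


def url_status(url, timeout=10):
    """Return the HTTP status code for url, using HEAD so no body is transferred."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Python 3's urlopen raises on 4xx/5xx instead of returning the response,
        # so a 404 arrives here rather than through the normal return path.
        return err.code
```

Note that urlopen follows redirects, so the code you get back is the one for the final response. Connection failures (bad hostname, timeout) raise urllib.error.URLError and are deliberately left uncaught here.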