I've been searching all around for a Python 3.x code sample to get HTTP Header information.
Something as simple as an equivalent of PHP's get_headers cannot be found easily in Python. Or maybe I am not sure how to best wrap my head around it.
In essence, I would like to write something that lets me see whether a URL exists or not,
something along the lines of
h = get_headers(url)
if(h[0] == 200)
{
print("Bingo!")
}
So far, I tried
h = http.client.HTTPResponse('http://docs.python.org/')
But I always got an error.
To get an HTTP response code in Python 3.x, use the urllib.request module:
>>> import urllib.request
>>> response = urllib.request.urlopen(url)
>>> response.getcode()
200
>>> if response.getcode() == 200:
...     print('Bingo')
...
Bingo
The returned HTTPResponse object will give you access to all of the headers as well. For example:
>>> response.getheader('Server')
'Apache/2.2.16 (Debian)'
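As a rough counterpart to PHP's get_headers, the two calls above can be wrapped in one helper. This is only a sketch: the function name get_headers is borrowed from the question, the timeout parameter is an assumption, and it sends a HEAD request (which some servers do not support) so that no body is downloaded.

```python
import urllib.request


def get_headers(url, timeout=10):
    """Return (status_code, headers) for url using a HEAD request.

    Hypothetical helper, not part of the standard library.
    """
    # method="HEAD" asks the server for the headers only, without the body.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # getheaders() returns a list of (name, value) pairs; converting it
        # to a dict keeps only the last value of any repeated header.
        return response.status, dict(response.getheaders())
```

Calling `status, headers = get_headers('http://docs.python.org/')` would then give the status code and a dictionary such as `headers['Server']`. An HTTP error status still raises an exception here rather than being returned.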
If the call to urllib.request.urlopen() fails because the server responds with an error status (such as 404), an HTTPError exception is raised. You can handle this to get the response code:
import urllib.error
import urllib.request

try:
    response = urllib.request.urlopen(url)
    if response.getcode() == 200:
        print('Bingo')
    else:
        print('The response code was not 200, but: {}'.format(
            response.getcode()))
except urllib.error.HTTPError as e:
    # Raised for HTTP error statuses such as 404 or 500.
    print('''An error occurred: {}
The response code was {}'''.format(e, e.getcode()))
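For the original goal of checking whether a URL exists, the same try/except can be folded into a small function. The sketch below makes a few assumptions: the name url_exists and the timeout parameter are made up for illustration, any status from 200 to 399 counts as "exists", and a failure to reach the server at all (URLError) is treated the same as a missing page.

```python
import urllib.error
import urllib.request


def url_exists(url, timeout=10):
    """Return True if url answers a HEAD request with a 2xx or 3xx status.

    Hypothetical helper; what counts as "exists" is an assumption.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with an error status such as 404.
        return False
    except urllib.error.URLError:
        # The server could not be reached (unknown host, refused connection).
        return False
```

HTTPError is a subclass of URLError, so it has to be caught first if the two cases are ever going to be handled differently.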
In Python 2.x, urllib, urllib2 or httplib can be used here. However, note that urllib and urllib2 use httplib. Therefore, if you plan to do this check a lot (thousands of times), it would be better to use httplib directly. Additional documentation and examples are here.
Example code:
import httplib

try:
    h = httplib.HTTPConnection("www.google.com")
    h.connect()
except Exception as ex:
    print "Could not connect to page."
A similar story to urllib (or urllib2) and httplib from Python 2.x applies to the urllib.request and http.client libraries in Python 3.x. Again, http.client should be quicker. For more documentation and examples look here.
Example code:
import http.client

try:
    conn = http.client.HTTPConnection("www.google.com")
    conn.connect()
except Exception as ex:
    print("Could not connect to page.")
If you want to check the status codes, replace
conn.connect()
with
conn.request("GET", "/index.html")  # Could also use "HEAD" instead of "GET".
res = conn.getresponse()
if res.status == 200 or res.status == 302:  # Specify codes here.
    print("Page Found!")
Note, in both examples, if you would like to catch the specific exception relating to when the URL doesn't exist, rather than all of them, catch the socket.gaierror exception instead (see the socket documentation).
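Putting the Python 3.x pieces together, a minimal sketch with http.client might look like the following. The function name head_status, the timeout parameter and the choice to return None for an unresolvable host are assumptions made for illustration; other connection errors are deliberately left to propagate.

```python
import http.client
import socket
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Return the status code of a HEAD request, or None if the host
    name cannot be resolved. Hypothetical helper for illustration.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port,
                                           timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port,
                                          timeout=timeout)
    try:
        # Rebuild the request target from the path and query string.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)  # connects automatically if needed
        return conn.getresponse().status
    except socket.gaierror:
        # Name resolution failed, so the host does not exist.
        return None
    finally:
        conn.close()
```

A caller could then write `if head_status(url) in (200, 302): print("Page Found!")`. Unlike urllib.request, http.client does not follow redirects, so a 302 is reported as-is.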