Getting HEAD content with Python Requests

I'm trying to parse the result of a HEAD request done using the Python Requests library, but can't seem to access the response content.

According to the docs, I should be able to access the content from requests.Response.text. This works fine for me on GET requests, but returns None on HEAD requests.

GET request (works)

    import requests
    response = requests.get(url)
    content = response.text

content = <html>...</html>

HEAD request (no content)

    import requests
    response = requests.head(url)
    content = response.text

content = None

EDIT

OK, I've quickly realized from the answers that a HEAD request is not supposed to return content, only headers. But does that mean that, to access things found in the <head> tag of a page, like <link> and <meta> tags, one must GET the whole document?

asked Mar 04 '12 12:03

Yarin

2 Answers

By definition, the responses to HEAD requests do not contain a message-body.

Send a GET request if you want to, well, get a response body. Send a HEAD request iff you are only interested in the response status code and headers.

HTTP transfers arbitrary content; the HTTP term header is completely unrelated to an HTML <head>. However, you can ask the server to send only part of the document. If you know the length of the HTML <head> section (or an upper bound for it), you can include an HTTP Range header in your GET request that asks the remote server to return only that many bytes. If the remote server supports HTTP ranges, it will serve the reduced response with status 206 (Partial Content); a server that does not support them will typically ignore the header and send the whole document.
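A minimal sketch of the Range approach, not a definitive implementation. The helper names (HeadTagCollector, extract_head_tags, fetch_head_tags) and the 4096-byte upper bound are assumptions made for illustration; the parsing uses the standard library's html.parser, and the request asks for an uncompressed body so that a truncated response stays readable.

```python
from html.parser import HTMLParser

import requests


class HeadTagCollector(HTMLParser):
    """Collect <link> and <meta> tags that appear before the head ends."""

    def __init__(self):
        super().__init__()
        self.tags = []            # (tag name, attribute dict) pairs
        self.head_closed = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            # A <body> tag also ends the head, even if </head> was omitted.
            self.head_closed = True
        elif not self.head_closed and tag in ("link", "meta"):
            self.tags.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        if tag == "head":
            self.head_closed = True


def extract_head_tags(html_prefix):
    """Parse a possibly truncated HTML prefix; incomplete trailing markup is skipped."""
    parser = HeadTagCollector()
    parser.feed(html_prefix)
    return parser.tags


def fetch_head_tags(url, max_bytes=4096):
    """GET at most max_bytes of url and return (range_honored, tags)."""
    headers = {
        "Range": f"bytes=0-{max_bytes - 1}",   # byte ranges are inclusive
        "Accept-Encoding": "identity",          # ask for an uncompressed body
    }
    # stream=True defers the body download, so if the server ignores Range
    # and answers 200 with the full document, only the first chunk is read.
    with requests.get(url, headers=headers, stream=True, timeout=10) as response:
        response.raise_for_status()
        range_honored = response.status_code == 206   # 206 Partial Content
        chunk = next(response.iter_content(chunk_size=max_bytes), b"")
        encoding = response.encoding or "utf-8"
    return range_honored, extract_head_tags(chunk.decode(encoding, errors="replace"))
```

Calling fetch_head_tags(url) returns a flag telling you whether the server honored the range, plus the <link> and <meta> tags found in the bytes that were read. If the <head> is longer than max_bytes, tags past the cutoff are simply missing, so the bound needs to be generous.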

answered Sep 20 '22 01:09

phihag

A HEAD response doesn't have any content! Try response.headers - that's probably where the action is. An HTTP HEAD request doesn't get the <head> element of the HTML response you would get from a GET request. I think that's your mistake.
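As a rough illustration of what a HEAD response does carry, here is a small sketch. The function names describe_response and head_summary are made up for this example, and which headers a given server actually sends (Content-Type, Content-Length) is an assumption; either may be absent, in which case the value is None.

```python
import requests


def describe_response(response):
    """Summarize a response: status code and a few headers, plus whether a body arrived."""
    return {
        "status": response.status_code,
        # response.headers is case-insensitive; .get() returns None if a header is absent.
        "content_type": response.headers.get("Content-Type"),
        "content_length": response.headers.get("Content-Length"),
        "has_body": bool(response.content),
    }


def head_summary(url):
    """Send a HEAD request and summarize the result."""
    # requests.head() does not follow redirects unless asked to.
    response = requests.head(url, allow_redirects=True, timeout=10)
    return describe_response(response)
```

For a HEAD request, has_body should come back False while the status and headers are filled in, which is all the information the method is meant to give you.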

answered Sep 20 '22 01:09

Spacedman
