
Getting HEAD content with Python Requests


I'm trying to parse the result of a HEAD request done using the Python Requests library, but can't seem to access the response content.

According to the docs, I should be able to access the content from requests.Response.text. This works fine for me on GET requests, but returns None on HEAD requests.

GET request (works)

import requests
response = requests.get(url)
content = response.text

content = <html>...</html>

HEAD request (no content)

import requests
response = requests.head(url)
content = response.text

content = None


EDIT

OK, I've quickly realized from the answers that a HEAD request is not supposed to return content, only headers. But does that mean that, to access things found IN the <head> tag of a page, like <link> and <meta> tags, one must GET the whole document?

Yarin asked Mar 04 '12 12:03



2 Answers

By definition, the responses to HEAD requests do not contain a message-body.

Send a GET request if you want to, well, get a response body. Send a HEAD request iff you are only interested in the response status code and headers.

HTTP transfers arbitrary content; the HTTP term header is completely unrelated to an HTML <head>. However, a server can be asked to deliver only part of a document. If you know the length of the HTML <head> section (or an upper bound for it), you can include an HTTP Range header in your request that asks the remote server to return only a certain range of bytes. If the remote server supports HTTP ranges, it will serve just that part (status 206 Partial Content); otherwise it may ignore the header and send the whole document.
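A rough sketch of the Range approach, assuming the <head> fits in the first 4096 bytes (that limit is a guess, not something the page guarantees). The names HeadTagCollector, parse_head_tags and fetch_head_tags are made up for this illustration; only requests.get and the standard-library HTMLParser are real APIs.

```python
from html.parser import HTMLParser


class HeadTagCollector(HTMLParser):
    """Collect <meta> and <link> tags until </head> is seen."""

    def __init__(self):
        super().__init__()
        self.tags = []      # list of (tag_name, attributes_dict)
        self.done = False   # becomes True once </head> is reached

    def handle_starttag(self, tag, attrs):
        if not self.done and tag in ("meta", "link"):
            self.tags.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        if tag == "head":
            self.done = True


def parse_head_tags(html_fragment):
    """Parse a possibly truncated HTML string and return its head tags."""
    collector = HeadTagCollector()
    collector.feed(html_fragment)  # an incomplete trailing tag is simply not reported
    return collector.tags


def fetch_head_tags(url, max_bytes=4096):
    """Hypothetical helper: GET only the first max_bytes bytes, then parse them."""
    import requests  # third-party; imported here so the parser above works without it

    # Ask for bytes 0..max_bytes-1 (assumption: the <head> ends within that range).
    response = requests.get(
        url, headers={"Range": f"bytes=0-{max_bytes - 1}"}, timeout=10
    )
    # 206 means the server honored the range; 200 means it sent the full document.
    return response.status_code, parse_head_tags(response.text)
```

Calling fetch_head_tags(url) returns the status code together with the collected tags, so you can tell whether the server actually honored the range. If it answers 200, you have downloaded the whole page anyway and the parser still stops collecting at </head>.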

phihag answered Sep 20 '22 01:09


A HEAD doesn't have any content! Try response.headers - that's probably where the action is. An HTTP HEAD request doesn't get the <head> element of the HTML response you would get from a GET request. I think that's your mistake.
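To make that concrete, here is a small sketch of reading the headers of a HEAD response. The helper names describe_head_response and head_info are invented for this example, and the choice of Content-Type and Content-Length is just an assumption about which headers you might care about.

```python
def describe_head_response(status_code, headers):
    """Pick a few commonly useful fields out of a response's headers."""
    return {
        "status": status_code,
        "content_type": headers.get("Content-Type"),
        "content_length": headers.get("Content-Length"),
    }


def head_info(url):
    """Hypothetical helper: send a HEAD request and summarize the result."""
    import requests  # third-party; imported here so the helper above works without it

    # requests.head does not follow redirects unless asked to.
    response = requests.head(url, allow_redirects=True, timeout=10)
    # response.headers is a case-insensitive mapping of the HTTP response headers.
    return describe_head_response(response.status_code, response.headers)
```

Note that a server is free to omit Content-Length, so either field can come back as None.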

Spacedman answered Sep 20 '22 01:09