
Binary response content, requests lib

Tags: python

I was reading the documentation on the requests lib and it seems to be tremendously outdated or something.

I was going step by step, trying all the examples shown there and encountered a problem as I tried running the following piece:

import requests
from PIL import Image
from StringIO import StringIO

response = requests.get('http://www.github.com')
i = Image.open(StringIO(response.content))

That piece is from the official documentation. The first error that I got was the ImportError: no module named StringIO

Okay, then I found out that that module no longer exists, and that to import StringIO one has to write from io import StringIO. I did that, ran the code again, and this time it failed with TypeError: initial_value must be str or None, not bytes. What on earth did I do wrong? I don't follow... All I did was run the code from the official docs. I'm clueless.

EDITED: And yeah...to use PIL one has to install Pillow.

Albert asked Mar 17 '16 16:03



1 Answer

From what you say, you're running Python 3 (the standalone StringIO module was removed there, and its classes now live in io), while the example in the docs was written for Python 2.

So for your issue:

"TypeError: initial_value must be str or None, not bytes".

What that means is that in:

response = requests.get('http://www.github.com')

the value you pass to StringIO, response.content, is bytes rather than str. In Python 3, StringIO only accepts str (or None) as its initial value, and requests exposes the raw response body through response.content as bytes.
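The error can be reproduced without any network call. Here is a minimal sketch using only the standard library, where payload is a made-up stand-in for response.content (the names payload, error_message and buffer are just for illustration):

```python
from io import BytesIO, StringIO

# Hypothetical stand-in for response.content: a few raw bytes.
payload = b"\x89PNG\r\n\x1a\n"

try:
    StringIO(payload)  # text buffer: accepts str (or None), not bytes
except TypeError as exc:
    error_message = str(exc)
    print("StringIO refused the bytes:", error_message)

buffer = BytesIO(payload)  # binary buffer: accepts bytes
print("BytesIO kept the bytes intact:", buffer.getvalue() == payload)
```

So the problem is not the request itself, only the kind of in-memory buffer the bytes are wrapped in.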

Because the requests library works at a fairly low level, and all data going in and out of sockets (including the socket carrying the HTTP connection) is plain binary, i.e. not interpreted, you need to decode it before you can use it with string functions.

In Python 3, str is what unicode was in Python 2, and bytes is close to the old Python 2 str. So you would need to convert the bytes into a string to feed StringIO:

i = Image.open(StringIO(response.content.decode('utf-8')))

for example. But then I'd expect Image.open() to complain that it has no idea what to do with a text buffer, when all it really wants is a stream of bytes! (With real image data, the decode() call itself would most likely fail first with a UnicodeDecodeError.)

Since Image.open() expects a stream of bytes rather than a text stream, what you should actually do is use a BytesIO instead of a StringIO:

from io import BytesIO
i = Image.open(BytesIO(response.content))
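Put together, a Python 3 version of the docs example might look like the sketch below. It assumes requests and Pillow are installed. The fetch_image helper and its timeout default are my own additions for illustration, not part of either library.

```python
from io import BytesIO

import requests
from PIL import Image


def fetch_image(url, timeout=10):
    """Download url and open the response body as a Pillow image.

    Hypothetical helper, not part of requests or Pillow.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    # response.content is bytes, so wrap it in a binary in-memory buffer.
    return Image.open(BytesIO(response.content))
```

Calling fetch_image() with a URL that actually serves an image (for instance a placeholder such as 'https://example.com/picture.png') should then give you an image object whose size and format attributes you can inspect.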

Finally, it's nice of you to give an example, but it can't work as written: the URL you're fetching points to an HTML page, not an image.
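If you want to find out whether a response body looks like an image before handing it to Pillow, one rough option is to check its leading "magic" bytes. This is only a sketch: sniff_image_format and IMAGE_SIGNATURES are hypothetical names, and only three common formats are covered. Pillow does its own, far more complete detection and raises an error when it cannot identify the data.

```python
# Leading "magic" bytes of a few common image formats.
IMAGE_SIGNATURES = {
    "PNG": (b"\x89PNG\r\n\x1a\n",),
    "JPEG": (b"\xff\xd8\xff",),
    "GIF": (b"GIF87a", b"GIF89a"),
}


def sniff_image_format(data):
    """Return a format name if data starts like a known image, else None."""
    for name, prefixes in IMAGE_SIGNATURES.items():
        # bytes.startswith() accepts a tuple of candidate prefixes.
        if data.startswith(prefixes):
            return name
    return None


# An HTML page, like the one in the question, matches none of the signatures.
html_result = sniff_image_format(b"<!DOCTYPE html><html></html>")
png_result = sniff_image_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
print(html_result, png_result)
```

Checking the Content-Type header of the response is another reasonable way to catch the same mistake.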

HTH

zmo answered Sep 19 '22 14:09