
How to include picture bytes in JSON with Python? (encoding issue)

I would like to include picture bytes in a JSON document, but I am struggling with an encoding issue:

import urllib
import json

data = urllib.urlopen('https://www.python.org/static/community_logos/python-logo-master-v3-TM-flattened.png').read()
json.dumps({'picture' : data})

UnicodeDecodeError: 'utf8' codec can't decode byte 0x89 in position 0: invalid start byte

I don't know how to deal with this, since I am handling an image rather than text, so the encoding error confuses me. I am using Python 2.7. Can anyone help me? :)

Thom asked Jan 08 '15 10:01




1 Answer

JSON is a text format and expects Unicode text. Binary image data is not text, so when the json.dumps() function tries to decode the bytestring to unicode using UTF-8 (the default), that decoding fails.

You'll have to wrap your binary data in a text-safe encoding first, such as Base-64:

json.dumps({'picture' : data.encode('base64')})

Of course, this then assumes that the receiver expects your data to be wrapped so.
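As a rough sketch of the full round trip, the snippet below uses the standard base64 module instead of the Python 2-only 'base64' string codec. The short byte string is a made-up stand-in for the downloaded image (it starts with the PNG signature), and the variable names are only illustrative:

```python
import base64
import json

# Hypothetical stand-in for the image bytes; starts with the PNG signature.
data = b'\x89PNG\r\n\x1a\n' + b'\x00\x01\x02\xff'

# Sender: Base64-encode the bytes, then decode the ASCII result to text
# so that json.dumps() receives a text string rather than raw bytes.
payload = json.dumps({'picture': base64.b64encode(data).decode('ascii')})

# Receiver: parse the JSON, then reverse the Base64 wrapping.
restored = base64.b64decode(json.loads(payload)['picture'])
```

Here restored should compare equal to data, which is only possible because both sides agree on the Base64 wrapping.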

If your API endpoint has been so badly designed that it expects your image bytes to be passed in as text, then the alternative is to pretend that your bytes are really text; if you first decode them as Latin-1, each byte maps straight to the Unicode codepoint with the same value:

json.dumps({'picture' : data.decode('latin-1')})

With the data already a unicode object the json library will then proceed to treat it as text. This does mean that it can replace non-ASCII codepoints with \uhhhh escapes.
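A minimal sketch of the Latin-1 approach, again with a made-up stand-in for the image bytes, shows both the escaping and how a receiver could get the original bytes back. It assumes the receiver knows to encode the text as Latin-1 again:

```python
import json

# Hypothetical stand-in for the image bytes (the PNG signature).
data = b'\x89PNG\r\n\x1a\n'

# Latin-1 maps every byte value 0-255 to the codepoint with the same number,
# so this decode cannot fail, whatever the bytes are.
text = data.decode('latin-1')

# With the default ensure_ascii=True, non-ASCII codepoints such as U+0089
# are written as \uhhhh escapes in the JSON output.
payload = json.dumps({'picture': text})

# Receiver: parse the JSON and map the codepoints back to bytes.
restored = json.loads(payload)['picture'].encode('latin-1')
```

The escapes make the payload larger than the raw data, since every byte above 0x7F becomes a six-character sequence.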

Martijn Pieters answered Sep 25 '22 03:09