
Send a non-ASCII POST request in Python?

I'm trying to send a POST request to a web app using the mechanize module (itself a wrapper around urllib2). When I send the request, I get UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128). I tried unicode(string), unicode(string, encoding="utf-8"), unicode(string).encode(), and so on, but nothing worked: each attempt raised either the error above or TypeError: decoding Unicode is not supported.

I looked at the other SO answers to similar questions, but none helped.

Thanks in advance!

EDIT: Example that produces an error:

>>> prda = "šđćč"  # valid UTF-8 characters
>>> prda
'\xc5\xa1\xc4\x91\xc4\x87\xc4\x8d'
>>> print prda
šđćč
>>> prda.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)
>>> unicode(prda)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)

Asked Jan 07 '12 at 23:01 by Bo Milanovich



1 Answer

I assume you're using Python 2.x.

Given a unicode object:

myUnicode = u'\u4f60\u597d'

encode it using UTF-8:

mystr = myUnicode.encode('utf-8')

Note that you need to specify the encoding explicitly. By default, Python 2 will (usually) use ASCII.
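
Applied to the example in the question: in Python 2, prda is already a byte string (str), so calling prda.encode("utf-8") first decodes it implicitly with the ASCII codec, and that hidden step is what raises the UnicodeDecodeError. Decoding explicitly avoids it. Here is a minimal sketch of the round trip, written to run on both Python 2 and 3. The field name "name" and the variable names are made up for illustration, and I'm using the standard library's urlencode to build the form body rather than anything mechanize-specific:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only; variable and field names are assumptions.
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3

# The bytes the shell printed for prda in the question (UTF-8 for "šđćč").
raw = b'\xc5\xa1\xc4\x91\xc4\x87\xc4\x8d'

# Bytes -> text: decode, naming the codec explicitly.
text = raw.decode('utf-8')

# Text -> bytes: encode, again naming the codec explicitly.
payload = text.encode('utf-8')

# Build a form-encoded POST body; "name" is a hypothetical field name.
body = urlencode({'name': payload})
print(body)  # name=%C5%A1%C4%91%C4%87%C4%8D
```

The resulting body is plain ASCII, so it can be handed to whatever sends the request without triggering another implicit ASCII decode.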

Answered Nov 01 '22 at 14:11 by Laurence Gonsalves