
Python: UnicodeEncodeError when I use grep

I am using a simple Python script, simple.py, to get reservation results for my CID:

import requests

data = {"minorRev": "current minorRev #", "cid": "xxx", "apiKey": "xxx", "customerIpAddress": "  ", "creationDateStart": "03/31/2013"}

url = 'http://someservice/services/rs/'
req = requests.get(url, params=data)
print req
print req.text
print req.status_code

If I run python simple.py at the command prompt, it runs perfectly and prints req.text.

However, when I try

python simple.py | grep pattern

I get

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 1314: ordinal not in range(128)
Deepankar Bajpeyi asked Apr 01 '13 06:04


2 Answers

print has to encode the string before sending it to stdout, but when the process writes to a pipe, sys.stdout.encoding is None. In that case Python 2 falls back to the ascii codec when print receives a unicode object, and if the object contains non-ASCII characters, a UnicodeEncodeError is raised.
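The implicit encode step described above can be reproduced without a pipe. This is only a minimal sketch: encode_for_stdout is a hypothetical helper (not part of Python or of the question's script) that mimics what print does with a unicode object, treating an encoding of None as the ascii fallback seen on a piped Python 2 stdout.

```python
from __future__ import print_function

# The character from the question's traceback.
text = u"\xe4"


def encode_for_stdout(value, encoding):
    """Hypothetical helper: encode `value` the way print would.

    encoding=None stands in for a piped Python 2 stdout, where
    sys.stdout.encoding is None and the ascii codec is used instead.
    """
    return value.encode(encoding or "ascii")


try:
    encode_for_stdout(text, None)
except UnicodeEncodeError as exc:
    # Same failure as in the question, triggered explicitly.
    print("ascii failed:", exc.reason)

# With an explicit codec the same text encodes without error.
print(repr(encode_for_stdout(text, "utf-8")))
```

The sketch runs under both Python 2 and Python 3, because it calls encode directly instead of relying on how print picks a codec.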

You can solve this by encoding every unicode object before sending it to standard output (though you will have to guess which codec to use). See these examples:

File wrong.py:

# coding: utf-8

print u'Álvaro'

Result:

alvaro@ideas:/tmp
$ python wrong.py 
Álvaro
alvaro@ideas:/tmp
$ python wrong.py | grep a
Traceback (most recent call last):
  File "wrong.py", line 3, in <module>
    print u'Álvaro'
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in position 0: ordinal not in range(128)

File right.py:

# coding: utf-8

print u'Álvaro'.encode('utf-8')
# an encoded unicode object is a byte string (`str`) in Python 2

Result:

alvaro@ideas:/tmp
$ python right.py 
Álvaro
alvaro@ideas:/tmp
$ python right.py | grep a
Álvaro
Álvaro Justen answered Sep 23 '22 04:09


If sys.stdout.isatty() is false (the output is redirected to a file or pipe), set the PYTHONIOENCODING environment variable outside your script. Always print Unicode; don't hardcode the character encoding of your environment inside your script:

$ PYTHONIOENCODING=utf-8 python simple.py | grep pattern
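To see the effect of PYTHONIOENCODING on a piped process without involving grep, the command above can be approximated from Python itself. This is a rough sketch, not part of the original answer: run_piped is a hypothetical helper, and the child program is a one-line stand-in for simple.py that prints a single non-ASCII string.

```python
import os
import subprocess
import sys


def run_piped(io_encoding):
    """Hypothetical helper: run a child Python with stdout connected to a pipe.

    PYTHONIOENCODING is set only in the child's environment, so the
    child script itself contains no hardcoded encoding.
    """
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    child = subprocess.Popen(
        # Stand-in for simple.py: print one non-ASCII unicode string.
        [sys.executable, "-c", "print(u'\\xc1lvaro')"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
    )
    out, _err = child.communicate()
    return child.returncode, out


returncode, output = run_piped("utf-8")
# The pipe carries the UTF-8 bytes of the printed string.
print(returncode, output.strip() == u"\xc1lvaro".encode("utf-8"))
```

Calling run_piped("ascii") instead forces the ascii codec on the child's stdout, which is the situation that produces the UnicodeEncodeError from the question.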
jfs answered Sep 21 '22 04:09