
Python: saving non ascii characters to file

Tags:

python

I'm trying to make a function which prints to the command prompt and to a file. I get encoding/decoding errors with the following code:

import os

def pas(stringToProcess): #printAndSave
  print stringToProcess 
  try: f = open('file', 'a')
  except: f = open('file', 'wb')
  print  >> f, stringToProcess
  f.close()

all = {u'title': u'Pi\xf1ata', u'albumname': u'New Clear War {EP}', u'artistname': u'Montgomery'}

pas(all['title'])

I get the following output:

Piñata
Traceback (most recent call last):
  File "new.py", line 17, in <module>
     pas(all['title'])
  File "new.py", line 11, in pas
    print  >> f, stringToProcess
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 2: ordinal not in range(128)

I've tried all the encode()/decode() permutations I can imagine from similar answers on here, without success. How can this error be solved?

stretch asked May 14 '26 22:05


2 Answers

As someone commented, you probably just need to specify which codec to use when writing the string. E.g., this works for me:

def pas(s):
    print(s)
    with open("file", "at") as f:
        f.write("%s\n" % s.encode("utf-8"))

pas(u'Pi\xf1ata')
pas(u'Pi\xf1ata')

As you can see, I open the file explicitly in append/text mode ("at"). If the file doesn't exist, it will be created, so the try/except fallback isn't needed. I also use a with block instead of your try-except method, which closes the file automatically. This is merely the style I prefer.

As Bhargav says, you can also set the default encoding. It all depends on how much control you need in your program and both ways are fine.
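
A variation on the same idea, offered as a sketch rather than the definitive fix: io.open (available in Python 2.6+ and in Python 3) accepts an encoding argument, so the file object does the encoding and no explicit .encode() call is needed. The helper name pas_utf8 and the file name pinata_demo.txt are made up for this illustration, and the sketch covers only the file-writing half of the function.

```python
import io

def pas_utf8(s, path="pinata_demo.txt"):
    """Append the unicode string s to path as UTF-8 (hypothetical helper)."""
    # io.open encodes text on write using the codec named here.
    with io.open(path, "a", encoding="utf-8") as f:
        f.write(s + u"\n")

pas_utf8(u"Pi\xf1ata")
```

Because the codec is attached to the file object, every later write to that file uses UTF-8 without any change to interpreter-wide settings.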

csl answered May 17 '26 11:05


Use sys.setdefaultencoding('utf8') to prevent the error from occurring.

That is:

import sys
reload(sys)  # Python 2 only: restores setdefaultencoding, which site.py removes at startup
sys.setdefaultencoding('utf8')
def pas(stringToProcess): #printAndSave
  print stringToProcess 
  try: f = open('file', 'a')
  except: f = open('file', 'wb')
  print  >> f, stringToProcess
  f.close()

all = {u'title': u'Pi\xf1ata', u'albumname': u'New Clear War {EP}', u'artistname': u'Montgomery'}

pas(all['title'])

This would print

Piñata
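
For comparison, the failure in the question can be reproduced and avoided without changing the interpreter-wide default. The sketch below uses only built-in string methods. It assumes, based on the traceback in the question, that the error comes from an implicit ASCII encode of u'\xf1'. The variable names are illustrative.

```python
text = u"Pi\xf1ata"

# Encoding with the ASCII codec fails, as in the question's traceback.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 succeeds; the n-tilde becomes two bytes.
utf8_bytes = text.encode("utf-8")
```

Writing utf8_bytes to a file opened in binary mode would store the same content that the default-encoding approach produces.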
Bhargav Rao answered May 17 '26 12:05