
In Python, how do I convert a list of ints and strings to Unicode?

Tags: python, unicode

x = ['Some strings.', 1, 2, 3, 'More strings!', 'Fanc\xc3\xbf string!']
y = [i.decode('UTF-8') for i in x]

What's the best way to convert the strings in x to Unicode? A list comprehension raises an attribute error (AttributeError: 'int' object has no attribute 'decode') because ints don't have a decode method.

I could use a for loop with a try. Or I could do some explicit type checking in the list comprehension, but is type checking the right approach in a dynamic language like Python?

UPDATE:

I would prefer that the ints remain ints, although this is not a strict requirement. My ideal output would be [u'Some strings.', 1, 2, 3, u'More strings!', u'Fancÿ string!'].

Buttons840 asked Mar 05 '12 17:03


2 Answers

If you want to keep the integers in the list as they are and convert only the strings to unicode, you can do:

x = ['Some strings.', 1, 2, 3, 'More strings!']
y = [i.decode('UTF-8') if isinstance(i, basestring) else i for i in x]

which gets you

[u'Some strings.', 1, 2, 3, u'More strings!']
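
For reference, here is the same type check as a small sketch that also runs on Python 3, where basestring and unicode no longer exist. The function name decode_strings is hypothetical, and the sketch assumes the text arrives as byte strings (str in Python 2, bytes in Python 3). Checking for bytes rather than basestring also leaves values that are already decoded untouched.

```python
def decode_strings(items, encoding='UTF-8'):
    """Decode byte strings in items; return every other value unchanged."""
    return [i.decode(encoding) if isinstance(i, bytes) else i for i in items]


# The question's list, written with explicit byte-string literals.
x = [b'Some strings.', 1, 2, 3, b'More strings!', b'Fanc\xc3\xbf string!']
y = decode_strings(x)
print(y)
```

The ints stay ints, and the last element decodes to the text with ÿ (U+00FF) that the question asks for.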
cjm answered Oct 14 '22 04:10


You could use the unicode function:

>>> x = ['Some strings.', 1, 2, 3, 'More strings!']
>>> y = [unicode(i) for i in x]
>>> y
[u'Some strings.', u'1', u'2', u'3', u'More strings!']

UPDATE: since you specified that you want the integers to remain as-is, I would use this:

>>> y = [unicode(i) if isinstance(i, basestring) else i for i in x]
>>> y
[u'Some strings.', 1, 2, 3, u'More strings!']
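
The question also asks about a loop with try instead of a type check. Below is a minimal sketch of that alternative, written for Python 3 and assuming the text arrives as bytes. The helper name decode_if_possible is hypothetical. Anything without a decode method (ints, and already-decoded str values in Python 3) comes back unchanged.

```python
def decode_if_possible(item, encoding='UTF-8'):
    """Return item decoded to text if it has a decode method, else item itself."""
    try:
        return item.decode(encoding)
    except AttributeError:
        # ints (and Python 3 str objects) have no decode method
        return item


x = [b'Some strings.', 1, 2, 3, b'More strings!']
y = [decode_if_possible(i) for i in x]
print(y)
```

The isinstance check states the intent more directly, while the try version accepts any object that happens to have a decode method. Which one reads better is largely a matter of taste.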

Note: as @Boldewyn points out, if the byte strings are UTF-8 encoded, you should pass the encoding parameter to the unicode function:

unicode(i, encoding='UTF-8')
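
A short sketch of why the encoding matters. In Python 2, unicode(i) with no encoding falls back to the default codec, which is ASCII unless the interpreter has been reconfigured, so byte strings containing non-ASCII bytes fail to decode. The sketch uses bytes.decode, which behaves the same way on Python 2 and 3, in place of unicode(). The variable names are illustrative.

```python
raw = b'Fanc\xc3\xbf string!'  # UTF-8 bytes from the question's last element

# Decoding as ASCII fails because \xc3 and \xbf are outside the ASCII range.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the encoding the bytes were written in succeeds.
text = raw.decode('UTF-8')
print(ascii_ok, len(raw), len(text))
```

The decoded text is one character shorter than the byte string, because the two bytes \xc3\xbf become the single character ÿ.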
jterrace answered Oct 14 '22 04:10