x = ['Some strings.', 1, 2, 3, 'More strings!', 'Fanc\xc3\xbf string!']
y = [i.decode('UTF-8') for i in x]
What's the best way to convert the strings in x to Unicode? The list comprehension above raises an AttributeError ('int' object has no attribute 'decode') because ints don't have a decode method.
I could use a for loop with a try/except, or I could do some explicit type checking in the list comprehension, but is type checking in a dynamic language like Python the right approach?
UPDATE:
I would prefer that the ints remain ints, although this is not a strict requirement. My ideal output would be [u'Some strings.', 1, 2, 3, u'More strings!', u'Fancÿ string!'].
If you want to keep the integers in the list as they are and convert only the strings to unicode, you can do:
x = ['Some strings.', 1, 2, 3, 'More strings!']
y = [i.decode('UTF-8') if isinstance(i, basestring) else i for i in x]
which gets you
[u'Some strings.', 1, 2, 3, u'More strings!']
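For reference, here is a minimal sketch of the same idea that should also run on Python 3, where the old str type became bytes and basestring no longer exists. The helper name decode_items is hypothetical, and the b'' prefixes are an assumption added so that the input is a byte string on both Python 2.6+ and Python 3:

```python
def decode_items(items, encoding='utf-8'):
    """Decode byte strings in items; leave every other value unchanged.

    decode_items is a hypothetical helper name, not part of any library.
    """
    # On Python 2, bytes is an alias for str, so this matches plain strings.
    # On Python 3, it matches only bytes objects.
    return [i.decode(encoding) if isinstance(i, bytes) else i for i in items]


x = [b'Some strings.', 1, 2, 3, b'More strings!', b'Fanc\xc3\xbf string!']
y = decode_items(x)
```

Checking against bytes rather than basestring means values that are already unicode text pass through untouched instead of being decoded a second time.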
You could use the unicode function:
>>> x = ['Some strings.', 1, 2, 3, 'More strings!']
>>> y = [unicode(i) for i in x]
>>> y
[u'Some strings.', u'1', u'2', u'3', u'More strings!']
UPDATE: since you specified that you want the integers to remain as-is, I would use this:
>>> y = [unicode(i) if isinstance(i, basestring) else i for i in x]
>>> y
[u'Some strings.', 1, 2, 3, u'More strings!']
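Note: without an explicit encoding, unicode() decodes with the default codec (normally ASCII) and raises UnicodeDecodeError on non-ASCII bytes such as 'Fanc\xc3\xbf string!'. As @Boldewyn points out, if you want UTF-8, you should pass the encoding parameter to the unicode function: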
unicode(i, encoding='UTF-8')
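To see why the encoding matters, here is a small sketch using the non-ASCII string from the question. It uses the decode method rather than unicode() so that it runs on both Python 2 and Python 3; decoding as 'ascii' is an assumption standing in for what unicode(i) does under Python 2's usual default codec:

```python
# UTF-8 bytes for "Fanc", y with diaeresis (U+00FF), " string!"
data = b'Fanc\xc3\xbf string!'

try:
    data.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    # Bytes above 0x7F are not valid ASCII, so decoding fails here.
    ascii_ok = False

# With the right codec, the two bytes C3 BF decode to one character.
text = data.decode('utf-8')

print(ascii_ok)                       # False
print(text == u'Fanc\xff string!')    # True
```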