Iterating through a unicode string in Python

I'm having trouble iterating through a unicode string, character by character, in Python.

print "w: ",word
for c in word:
    print "word: ",c

This is my output:

w:  文本
word:  ? 
word:  ?
word:  ?
word:  ?
word:  ?
word:  ?

My desired output is:

文
本

When I use len(word) I get 6, so apparently each character takes up 3 bytes.
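A length of 6 is consistent with word holding UTF-8 encoded bytes rather than decoded text. The sketch below illustrates the difference; it is written in Python 3 syntax (the question is about Python 2, where a plain str behaves like bytes in this respect), and the variable names are illustrative only:

```python
# Illustrative names; Python 3, where bytes and str are distinct types.
text = "文本"                  # two characters of decoded text
raw = text.encode("utf-8")     # the same text as UTF-8 bytes

byte_count = len(raw)          # counts bytes
char_count = len(text)         # counts characters (code points)

print(byte_count, char_count)  # prints: 6 2
```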

So, my unicode string is successfully stored in the variable, but I cannot get the characters out. I have tried using encode('utf-8'), decode('utf-8') and codecs, but still cannot get any good results. This seems like a simple problem, but it is frustratingly hard for me.

Hope someone can point me in the right direction.

Thanks!

asked Jun 22 '15 03:06

charpi

2 Answers

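# -*- coding: utf-8 -*-
# Python 2: a plain string literal is a byte string (UTF-8 bytes here).
word = "文本"
print(word)
# Decode the bytes to a unicode object so iteration yields characters, not bytes.
for each in unicode(word, "utf-8"):
    print(each)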

Output:

文本
文
本
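
The same decode-then-iterate idea can be sketched for Python 3, where unicode() no longer exists. The helper name iter_chars is hypothetical and not part of the answer above; it assumes the input is either UTF-8 bytes or already-decoded text:

```python
def iter_chars(data, encoding="utf-8"):
    """Yield one character at a time from bytes or already-decoded text."""
    # Hypothetical helper for illustration; decode only if we were given bytes.
    if isinstance(data, bytes):
        data = data.decode(encoding)
    for ch in data:
        yield ch


word = "文本".encode("utf-8")   # simulate the byte string from the question
chars = list(iter_chars(word))
print(chars)                    # prints: ['文', '本']
```

Note that this iterates over code points, so a base letter followed by a combining mark would come out as two separate items.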

answered Oct 19 '22 23:10

Pruthvi Raj

This is the code I ended up using, which works:

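import codecs
# codecs.open decodes the file to unicode as it is read
fileContent = codecs.open('fileName.txt', 'r', encoding='utf-8')
#...split by whitespace to get words..
for c in word:
    # each c is one unicode character; encode it back to UTF-8 for printing
    print(c.encode('utf-8'))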
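
For comparison, here is a self-contained sketch of the same read-then-iterate flow. It swaps in io.open (available in Python 2.6+ and Python 3) in place of codecs.open, and the file name and contents are made up for the example:

```python
import io
import os
import tempfile

# Create a small UTF-8 file so the sketch runs on its own (hypothetical content).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"文本 text")

# io.open decodes while reading, so each word is already text, not bytes.
with io.open(path, "r", encoding="utf-8") as f:
    words = f.read().split()

# Iterating a decoded word yields whole characters.
chars_per_word = [list(word) for word in words]
print(chars_per_word)
```

Decoding once at the file boundary means the rest of the code never has to deal with raw bytes.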

answered Oct 20 '22 01:10

charpi

Tags: python, unicode, python-2.x
