
How to loop through a UTF-8 string in Go?

Tags:

go

I have a string in Chinese:

x := "你好"

I'd like to loop through it and do something with each character in it, something like:

for i, n := 0, len(x); i < n; i++ {
    foo(x[i]) // do something with each character
}

I found that len(x) returns 6 instead of 2. After some searching I found utf8.RuneCountInString, which returns the number of characters in the string, but I still don't know how to loop so that x[i] gives the right character, for example x[0] == '你'.

Thanks

wong2 asked Oct 05 '12 05:10


1 Answer

Use range.

x := "你好"
for _, c := range x {
    // do something with c; c is a rune (a Unicode code point), not a byte
}
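
As a rough illustration of what range yields on a string, here is a minimal, self-contained sketch. The helper name describe is made up for this example. It also prints len(x) next to utf8.RuneCountInString(x) to show where the 6 versus 2 in the question comes from.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper for this example. It returns one entry
// per character of s in the form "byteOffset:character".
func describe(s string) []string {
	var out []string
	for i, c := range s { // i is a byte offset, c is the decoded rune
		out = append(out, fmt.Sprintf("%d:%c", i, c))
	}
	return out
}

func main() {
	x := "你好"
	fmt.Println(len(x))                    // 6: length in bytes
	fmt.Println(utf8.RuneCountInString(x)) // 2: number of characters
	fmt.Println(describe(x))               // [0:你 3:好]
}
```

Note that the index from range jumps from 0 to 3 here, because each of these two characters takes three bytes in UTF-8.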

If you want random access, you'll need to use code unit indexes (byte offsets, in UTF-8) rather than character indexes. Fortunately, there is seldom a good reason to need character indexes, so code unit indexes are usually fine.
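
A minimal sketch of both options, assuming the byte offset you hold points at the start of a character (for example, an index obtained from range). The helper runeAt is a made-up name for this example. The []rune conversion at the end is an alternative not mentioned above: it copies the string, but allows indexing by character position when that is really what you need.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAt is a hypothetical helper: it decodes the character that starts at
// byte offset i of s. It assumes i is the first byte of a character.
func runeAt(s string, i int) rune {
	r, _ := utf8.DecodeRuneInString(s[i:])
	return r
}

func main() {
	x := "你好"

	// Random access by byte offset: no copy, offset 3 is where '好' starts.
	fmt.Printf("%c\n", runeAt(x, 3)) // 好

	// Alternative: convert to []rune to index by character position.
	rs := []rune(x)
	fmt.Println(rs[0] == '你', len(rs)) // true 2
}
```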

Most languages have the exact same problem. For example, Java and C# use UTF-16, which is also a variable-length encoding (but some people pretend it isn't).

See the UTF-8 Manifesto for more information about why Go uses UTF-8.

Dietrich Epp answered Sep 19 '22 07:09
