
Loop over string returns int32

Tags: loops, range, go

Why does ranging over a string in Go yield an int32 value instead of the original character, unlike in other languages?

For example:

package main

import "fmt"

func main() {
    s := "Hello"
    for _, v := range s {
        // v has type rune (an alias for int32), so Println shows its number.
        fmt.Println(v)
    }
}

This prints:

72
101
108
108
111

Should we use a conversion like the one below to get the original character?

package main

import "fmt"

func main() {
    s := "Hello"
    for _, v := range s {
        // string(v) converts the rune to a one-character string.
        fmt.Println(string(v))
    }
}
shan kulkarni asked Nov 19 '25 12:11


1 Answer

The Go Programming Language Specification

For statements

For statements with range clause

For a string value, the "range" clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point. If the iteration encounters an invalid UTF-8 sequence, the second value will be 0xFFFD, the Unicode replacement character, and the next iteration will advance a single byte in the string.
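
The invalid-UTF-8 rule in the paragraph above is easy to miss, so here is a minimal sketch of it. The `step` type and the `rangeSteps` helper are hypothetical names introduced only for this illustration; the input string contains a lone 0xff byte, which is not valid UTF-8.

```go
package main

import "fmt"

// step records what one iteration of a range loop over a string yields.
type step struct {
	Index int  // byte index where the code point starts
	Rune  rune // decoded code point, or U+FFFD for an invalid byte
}

// rangeSteps ranges over s and collects the index and rune of each iteration.
func rangeSteps(s string) []step {
	var steps []step
	for i, r := range s {
		steps = append(steps, step{Index: i, Rune: r})
	}
	return steps
}

func main() {
	// "a" (1 byte), then an invalid byte 0xff, then "世" (3 bytes).
	s := "a\xff世"
	for _, st := range rangeSteps(s) {
		fmt.Printf("%d %U\n", st.Index, st.Rune)
	}
	// Output:
	// 0 U+0061
	// 1 U+FFFD
	// 2 U+4E16
}
```

The invalid byte produces U+FFFD and the loop moves forward by exactly one byte, so the following code point is still decoded from index 2.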


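In Go, a character is a Unicode code point, represented by the type rune (an alias for int32). Because a rune is an integer type, fmt.Println prints its numeric value rather than the character. Go strings conventionally hold Unicode code points in UTF-8 encoded form.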
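
As a small sketch of what that means in practice, the program below prints the type, the numeric value, and the character form of a rune using standard fmt verbs, and compares the byte length of a string with its number of code points. The `describe` helper is a hypothetical name used only here.

```go
package main

import "fmt"

// describe formats r as "<type> <decimal value> <character>".
func describe(r rune) string {
	return fmt.Sprintf("%T %d %c", r, r, r)
}

func main() {
	s := "Hello, 世界"

	fmt.Println(len(s))         // 13: length in bytes
	fmt.Println(len([]rune(s))) // 9: number of code points

	fmt.Println(describe('H'))  // int32 72 H
	fmt.Println(describe('世')) // int32 19990 世
}
```

If only the printed form matters, the %c verb (or %q for a quoted form) avoids the conversion to string altogether.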


The Go Programming Language Specification

Conversions

Conversions to and from a string type

Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD".

string('a')       // "a"
string(-1)        // "\ufffd" == "\xef\xbf\xbd"
string(0xf8)      // "\u00f8" == "ø" == "\xc3\xb8"
type MyString string
MyString(0x65e5)  // "\u65e5" == "日" == "\xe6\x97\xa5"
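
A common source of confusion is expecting this conversion to produce decimal digits. The sketch below contrasts the two; `asCharacter` and `asDecimal` are hypothetical helper names for illustration, while strconv.Itoa is the standard library function for decimal formatting. Note that go vet reports string(i) when i is a non-rune integer type, so keeping the value typed as rune makes the intent explicit.

```go
package main

import (
	"fmt"
	"strconv"
)

// asCharacter returns the UTF-8 encoding of the code point r.
// Values that are not valid code points become "\uFFFD".
func asCharacter(r rune) string {
	return string(r)
}

// asDecimal returns the base-10 text of n.
func asDecimal(n int) string {
	return strconv.Itoa(n)
}

func main() {
	fmt.Println(asCharacter(72))             // H
	fmt.Println(asDecimal(72))               // 72
	fmt.Println(asCharacter(-1) == "\uFFFD") // true
}
```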

For example,

package main

import (
    "fmt"
)

func main() {
    helloworld := "Hello, 世界"
    fmt.Println(helloworld)
    for i, r := range helloworld {
        fmt.Println(i, r, string(r))
    }
}

Playground: https://play.golang.org/p/R5sBeGiJzR4

Output:

Hello, 世界
0 72 H
1 101 e
2 108 l
3 108 l
4 111 o
5 44 ,
6 32  
7 19990 世
10 30028 界
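
The index jumps from 7 to 10 in this output because 世 takes three bytes in UTF-8. A minimal sketch that makes the widths visible, assuming valid UTF-8 input, is shown below; `runeWidths` is a hypothetical helper name and utf8.RuneLen comes from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeWidths returns how many bytes each code point of s needs in UTF-8.
// It assumes s is valid UTF-8: for an invalid byte, range yields U+FFFD,
// whose encoded length (3) differs from the single byte actually consumed.
func runeWidths(s string) []int {
	var widths []int
	for _, r := range s {
		widths = append(widths, utf8.RuneLen(r))
	}
	return widths
}

func main() {
	fmt.Println(runeWidths("o, 世界")) // [1 1 1 3 3]
}
```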

References:

The Go Blog: Strings, bytes, runes and characters in Go

The Unicode Consortium

peterSO answered Nov 21 '25 10:11


