
Golang converting from rune to string

I have the following code, which is supposed to convert a rune into a string and print it. However, I get undefined characters when it is printed, and I cannot figure out where the bug is:

package main

import (
    "fmt"
    "strconv"
    "strings"
    "text/scanner"
)

func main() {
    var b scanner.Scanner
    const a = `a`
    b.Init(strings.NewReader(a))
    c := b.Scan()
    fmt.Println(strconv.QuoteRune(c))
}
user3551708 asked Aug 31 '16 09:08




1 Answer

That's because you used Scanner.Scan() to read a rune, but it does something else. Scanner.Scan() reads tokens (or single runes for input that is not recognized as a token, depending on the Scanner.Mode bitmask), and for tokens it returns special constants from the text/scanner package, not the rune that was read.

To read a single rune use Scanner.Next() instead:

c := b.Next()
fmt.Println(c, string(c), strconv.QuoteRune(c))

Output:

97 a 'a' 
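
For reference, here is a minimal sketch of the question's program with Scan() swapped for Next(). The helper name firstRune is hypothetical and only exists to keep the example self-contained:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/scanner"
)

// firstRune returns the first rune of src using Scanner.Next.
// (firstRune is a hypothetical helper name used only in this sketch.)
func firstRune(src string) rune {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	return s.Next() // Next returns scanner.EOF at the end of the input
}

func main() {
	c := firstRune("a")
	fmt.Println(c, string(c), strconv.QuoteRune(c)) // 97 a 'a'
}
```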

If you just want to convert a single rune to string, use a simple type conversion. rune is an alias for int32, and the Go specification says this about converting integers to string:

Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer.

So:

r := rune('a')
fmt.Println(r, string(r))

Outputs:

97 a 
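
The same conversion works for runes outside ASCII, where the resulting string is more than one byte long. The sketch below uses U+2318 as an arbitrary example and contrasts the conversion with strconv.Itoa, which is what you would use if you wanted the decimal digits of the code point instead. The helper name runeToString is hypothetical:

```go
package main

import (
	"fmt"
	"strconv"
)

// runeToString converts a rune to its UTF-8 encoded string.
// (runeToString is a hypothetical helper name used only in this sketch.)
func runeToString(r rune) string {
	return string(r)
}

func main() {
	r := '\u2318' // the symbol ⌘
	s := runeToString(r)
	fmt.Println(s, len(s))            // ⌘ 3 (three UTF-8 bytes)
	fmt.Println(strconv.Itoa(int(r))) // 8984 (decimal value of the code point)
}
```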

Also, to loop over the runes of a string value, you can simply use the for ... range construct:

for i, r := range "abc" {
    fmt.Printf("%d - %c (%v)\n", i, r, r)
}

Output:

0 - a (97)
1 - b (98)
2 - c (99)
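
Note that the index produced by for ... range is the byte offset of each rune, not its position in the sequence of runes. With ASCII-only input the two coincide, so here is a small sketch with multi-byte runes to make the difference visible. The helper runeOffsets and the sample string are made up for illustration:

```go
package main

import "fmt"

// runeOffsets returns the byte offset at which each rune of s starts.
// (runeOffsets is a hypothetical helper name used only in this sketch.)
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	const s = "a\u00e9\u4e16" // 'a' (1 byte), 'é' (2 bytes), '世' (3 bytes)
	for i, r := range s {
		fmt.Printf("%d - %c (%v)\n", i, r, r)
	}
	fmt.Println(runeOffsets(s)) // [0 1 3]
}
```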

Or you can simply convert a string value to []rune:

fmt.Println([]rune("abc")) // Output: [97 98 99] 

There is also utf8.DecodeRuneInString().
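
A rough sketch of how utf8.DecodeRuneInString() can be used: it returns the first rune of the string together with its width in bytes, so you can decode one rune or walk the whole string by slicing off what was consumed. The helper name decodeAll is hypothetical:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeAll walks s one rune at a time with utf8.DecodeRuneInString.
// (decodeAll is a hypothetical helper name used only in this sketch.)
func decodeAll(s string) []rune {
	var runes []rune
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		runes = append(runes, r)
		s = s[size:] // drop the bytes that were just decoded
	}
	return runes
}

func main() {
	r, size := utf8.DecodeRuneInString("abc")
	fmt.Println(r, size, string(r)) // 97 1 a
	fmt.Println(decodeAll("abc"))   // [97 98 99]
}
```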

Try the examples on the Go Playground.

Note:

Your original code (using Scanner.Scan()) works like this:

  1. You called Scanner.Init() which sets the Mode (b.Mode) to scanner.GoTokens.
  2. Calling Scanner.Scan() on the input (from "a") returns scanner.Ident because "a" is a valid Go identifier:

    c := b.Scan()
    if c == scanner.Ident {
        fmt.Println("Identifier:", b.TokenText())
    }

    // Output: "Identifier: a"
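
To see what Scan() classifies each piece of input as, you can loop until scanner.EOF and print scanner.TokenString() next to TokenText(). This is only a sketch under the default mode set by Init(); the helper tokenKinds and the sample input "a 42" are made up for illustration:

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// tokenKinds scans src with the default mode (scanner.GoTokens) and
// returns one "Kind:text" entry per token.
// (tokenKinds is a hypothetical helper name used only in this sketch.)
func tokenKinds(src string) []string {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	var out []string
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		out = append(out, scanner.TokenString(tok)+":"+s.TokenText())
	}
	return out
}

func main() {
	fmt.Println(tokenKinds("a 42")) // [Ident:a Int:42]
}
```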
icza answered Sep 20 '22 20:09
