
Invalid Unicode code point 0xd83f

Tags:

unicode

go

I'm trying to port some Java to Go. The Java code has a character variable with the value '\ud83f'. When I try to use this value in Go, it doesn't compile:

package main
func main() {
    c := '\ud83f'
    println(c)
}

$ go run a.go
# command-line-arguments
./a.go:3: invalid Unicode code point in escape sequence: 0xd83f

Why? I also tried making a string with that value in Python and it worked too. It's just not working in Go for some reason.

Dog asked May 15 '26 04:05


2 Answers

That rune literal is invalid because it denotes a surrogate code point. The spec says that the \u and \U escapes cannot represent surrogate halves, nor values above 0x10FFFF:

Rune Literals

[...]

The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves.

Further below in the examples, you can see another case which is deemed illegal:

'\U00110000' // illegal: invalid Unicode code point

This confirms that code points above 0x10FFFF are also illegal in rune literals.

Note that since rune is merely an alias for int32, you can assign the numeric value directly:

var r rune = 0xd83f

instead of

var r rune = '\ud83f'

Likewise, if you wanted a value above 0x10FFFF, you could write

var r rune = 0x11ffff

instead of

var r rune = '\U0011ffff'
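
A rune assigned this way compiles, but it is still not a valid code point. The following is a minimal sketch of how that shows up at run time; the helper name describe is made up for illustration, while utf8.ValidRune and the rune-to-string conversion are standard Go.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports whether r is a valid
// Unicode code point and what converting it to a string produces.
func describe(r rune) (valid bool, s string) {
	return utf8.ValidRune(r), string(r)
}

func main() {
	// 0xd83f is a surrogate half; 0x11ffff is above the 0x10FFFF maximum.
	// Both are legal int32 values, so the assignments compile.
	for _, r := range []rune{0xd83f, 0x11ffff, 'A'} {
		valid, s := describe(r)
		fmt.Printf("%#x valid=%t string=%q\n", r, valid, s)
	}
}
```

For the two invalid values, the string conversion yields the replacement character U+FFFD rather than the original number, so the value should probably be kept as a rune (or plain integer) if it has to survive unchanged.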
Harold R. Eason answered May 18 '26 07:05


As already mentioned, \ud83f is a surrogate half, one of the two 16-bit units that UTF-16 uses to encode a code point above 0xFFFF. On its own it is not a valid code point, and the Go specification explicitly states:

The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves.

If you want a rune with this invalid code point, you can do the following:

c := rune(0xd83f)

However, the correct way to handle such a value is to first decode the two surrogate halves together, and then use the resulting valid code point.
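
Decoding the pair can be sketched with the standard unicode/utf16 package. The helper name decodePair and the low surrogate 0xdc00 are assumptions made for illustration; the question only gives the high half, so the real low half has to come from the original Java data.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// replacement is U+FFFD, which utf16.DecodeRune returns for an invalid pair.
const replacement = '\uFFFD'

// decodePair is a hypothetical helper: it combines a high and a low
// surrogate into one code point and reports whether the pair was valid.
func decodePair(hi, lo rune) (r rune, ok bool) {
	r = utf16.DecodeRune(hi, lo)
	// A valid pair always decodes to a value >= 0x10000, so U+FFFD
	// can only mean that the pair was rejected.
	return r, r != replacement
}

func main() {
	hi := rune(0xd83f) // the value from the question (a high surrogate)
	lo := rune(0xdc00) // assumed low surrogate, for illustration only

	r, ok := decodePair(hi, lo)
	fmt.Printf("%U ok=%t\n", r, ok) // prints "U+1FC00 ok=true"
}
```

If the data arrives as a whole sequence of UTF-16 code units rather than as individual halves, utf16.Decode on a []uint16 may be a better fit, since it pairs the surrogates for you.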

ANisus answered May 18 '26 06:05


