Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Golang - ToUpper() on a single byte?

I have a []byte, b, and I want to select a single byte, b[pos] and change it too upper case (and then lower case) The bytes type has a method called ToUpper(). How can I use this for a single byte?

Calling ToUpper on single Byte

OneOfOne gave the most efficient (when calling thousands of times), I use

val = byte(unicode.ToUpper(rune(b[pos])))

in order to find the byte and change the value

b[pos] = val

Checking if byte is Upper

Sometimes, instead of changing the case of a byte, I want to check if a byte is upper or lower case; All the upper case roman-alphabet bytes are lower than the value of the lower case bytes.

func (b Board) isUpper(x int) bool {
    return b.board[x] < []byte{0x5a}[0]
}
like image 528
Nevermore Avatar asked Jul 02 '16 16:07

Nevermore


3 Answers

For a single byte/rune, you can use unicode.ToUpper.

b[pos] = byte(unicode.ToUpper(rune(b[pos])))
like image 151
OneOfOne Avatar answered Nov 15 '22 10:11

OneOfOne


I want to remind OP that bytes.ToUpper() operates on unicode code points encoded using UTF-8 in a byte slice while unicode.ToUpper() operates on a single unicode code point.

By asking to convert a single byte to upper case, OP is implying that the "b" byte slice contains something other than UTF-8, perhaps ASCII-7 or some 8-bit encoding such as ISO Latin-1 (e.g.). In that case OP needs to write an ISO Latin-1 (e.g.) ToUpper() function or OP must convert the ISO Latin-1 (e.g.) bytes to UTF-8 or unicode before using the bytes.ToUpper() or unicode.ToUpper() function.

Anything less creates a pending bug. Neither of the previously mentioned functions will properly convert all possible ISO Latin-1 (e.g.) encoded characters to upper case.

like image 42
Doug Henderson Avatar answered Nov 15 '22 09:11

Doug Henderson


Use the following code to test if an element of the board is an ASCII uppercase letter:

func (b Board) isUpper(x int) bool {
    v := b.board[x]
    return 'A' <= v && v <= 'Z'
}

If the application only needs to distinguish between upper and lowercase letters, then there's no need for the lower bound test:

func (b Board) isUpper(x int) bool {
    return b.board[x] <= 'Z'
}

The code in this answer improves on the code in the question in a few ways:

  • The code in the answer returns the correct value for a board element containing 'Z' (run playground example below for demonstration).
  • 'Z' and 0x85 are the same value, but the code is easier to understand with 'Z'.
  • It's simpler to compare directly with the value 'Z'. No need to create a slice.

playground example

Edit: Revamped answer based on new information in the question since time of my original answer.

like image 2
Bayta Darell Avatar answered Nov 15 '22 10:11

Bayta Darell