
How to convert []int8 to string

What's the best way (fastest performance) to convert from []int8 to string?

For []byte we could do string(byteslice), but for []int8 it gives an error:

cannot convert ba (type []int8) to type string

I got ba from the SliceScan() method of *sqlx.Rows, which produces []int8 instead of string.

Is this solution the fastest?

func B2S(bs []int8) string {
    ba := []byte{}
    for _, b := range bs {
        ba = append(ba, byte(b))
    }
    return string(ba)
}

EDIT: My bad, it's []uint8 instead of []int8, so I can do string(ba) directly.

Kokizzu asked Mar 04 '15 06:03


2 Answers

Note beforehand: The asker first stated that the input slice is []int8, so that is what this answer addresses. He later realized the input is []uint8, which can be converted to string directly because byte is an alias for uint8 (and the []byte => string conversion is supported by the language spec).
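To illustrate the note above, here is a minimal sketch of the []uint8 case. The helper name U2S is hypothetical and only used for this example; the sample values are the UTF-8 bytes of "世界".

```go
package main

import "fmt"

// U2S converts a []uint8 to a string. U2S is a hypothetical helper name.
// Because byte is an alias for uint8, []uint8 and []byte are the same type,
// so the built-in conversion applies directly.
func U2S(us []uint8) string {
	return string(us)
}

func main() {
	us := []uint8{228, 184, 150, 231, 149, 140} // UTF-8 bytes of "世界"
	var bs []byte = us                          // plain assignment: identical types
	fmt.Println(U2S(us), string(bs))
}
```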


You can't convert between slices of different element types directly; you have to do it manually.

The question is which slice type we should convert to. There are 2 candidates: []byte and []rune. Strings are stored internally as UTF-8 encoded byte sequences ([]byte), and a string can also be converted to a slice of runes. The language supports converting both of these types ([]byte and []rune) to string.

A rune is a Unicode code point. If we convert each int8 to a rune one-to-one, the output will be wrong whenever the input contains characters that are encoded as multiple bytes in UTF-8, because in that case multiple int8 values should end up in a single rune.

Let's start from the string "世界" whose bytes are:

fmt.Println([]byte("世界"))
// Output: [228 184 150 231 149 140]

And its runes:

fmt.Println([]rune("世界"))
// [19990 30028]

That's only 2 runes but 6 bytes. So a 1-to-1 int8->rune mapping obviously won't work; we have to go with a 1-to-1 int8->byte mapping.
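As a rough sketch of why the 1-to-1 rune mapping goes wrong, the following maps each input value to its own rune. The function name naiveRunes is hypothetical and exists only to show the incorrect approach.

```go
package main

import "fmt"

// naiveRunes maps every int8 to its own rune (via its byte value).
// It is a hypothetical helper that demonstrates the WRONG approach.
func naiveRunes(bs []int8) string {
	rs := make([]rune, len(bs))
	for i, v := range bs {
		rs[i] = rune(byte(v))
	}
	return string(rs)
}

func main() {
	in := []int8{-28, -72, -106, -25, -107, -116} // the bytes of "世界" as int8
	out := naiveRunes(in)
	// Each input value became a separate character instead of
	// three values combining into one.
	fmt.Println(len([]rune(out))) // 6
	fmt.Println(out == "世界")      // false
}
```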

byte is an alias for uint8, with range 0..255. To convert a byte to int8 (range -128..127), we have to use bytevalue - 256 if the byte value is > 127, so the "世界" string as []int8 looks like this:

[-28 -72 -106 -25 -107 -116]
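The values above can be reproduced with a short sketch. S2B is a hypothetical name for the forward direction (string to []int8); it is not part of the question.

```go
package main

import "fmt"

// S2B returns the bytes of s reinterpreted as int8 values.
// S2B is a hypothetical helper name.
func S2B(s string) []int8 {
	out := make([]int8, len(s))
	for i := 0; i < len(s); i++ {
		out[i] = int8(s[i]) // byte values above 127 wrap around to value-256
	}
	return out
}

func main() {
	fmt.Println(S2B("世界")) // [-28 -72 -106 -25 -107 -116]
}
```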

The backward conversion we want is: bytevalue = 256 + int8value if the int8 is negative. We can't do this arithmetic in int8 (range -128..127) or in byte (range 0..255), so we have to convert to int first (and back to byte at the end). This could look something like this:

if v < 0 {
    b[i] = byte(256 + int(v))
} else {
    b[i] = byte(v)
}

But since signed integers are represented using 2's complement, we get the same result if we simply use a byte(v) conversion (which for negative numbers is equivalent to 256 + v).
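This equivalence can be checked exhaustively, since int8 has only 256 values. A minimal sketch, where explicit and agree are hypothetical helper names:

```go
package main

import "fmt"

// explicit converts using the explicit arithmetic from the text.
func explicit(v int8) byte {
	if v < 0 {
		return byte(256 + int(v))
	}
	return byte(v)
}

// agree reports whether byte(v) equals explicit(v) for every int8 value.
func agree() bool {
	for i := -128; i <= 127; i++ {
		v := int8(i)
		if byte(v) != explicit(v) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(agree()) // true
}
```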

Note: Since we know the length of the slice, it is generally faster to allocate a slice of that length up front and set its elements by indexing ([]) than to call the built-in append function repeatedly.

So here is the final conversion:

func B2S(bs []int8) string {
    b := make([]byte, len(bs))
    for i, v := range bs {
        b[i] = byte(v)
    }
    return string(b)
}

Try it on the Go Playground.
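To check the performance claim on your own machine, one option is a quick measurement with testing.Benchmark, which can be called from an ordinary program. This is only a sketch: the function names appendFromEmpty, indexed and sampleInput, and the input size, are arbitrary choices for illustration, and the numbers will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// appendFromEmpty is the version from the question: the slice grows as it goes.
func appendFromEmpty(bs []int8) string {
	ba := []byte{}
	for _, b := range bs {
		ba = append(ba, byte(b))
	}
	return string(ba)
}

// indexed is the version from this answer: allocate once, assign by index.
func indexed(bs []int8) string {
	b := make([]byte, len(bs))
	for i, v := range bs {
		b[i] = byte(v)
	}
	return string(b)
}

// sampleInput builds n int8 values by repeating the bytes of "世界".
func sampleInput(n int) []int8 {
	src := []int8{-28, -72, -106, -25, -107, -116}
	in := make([]int8, n)
	for i := range in {
		in[i] = src[i%len(src)]
	}
	return in
}

func main() {
	in := sampleInput(6000) // arbitrary size
	variants := []struct {
		name string
		fn   func([]int8) string
	}{
		{"append from empty", appendFromEmpty},
		{"make + index", indexed},
	}
	for _, v := range variants {
		fn := v.fn
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_ = fn(in)
			}
		})
		fmt.Printf("%-18s %s\n", v.name, res)
	}
}
```

Both variants return the same string; only the allocation pattern differs.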

icza answered Oct 24 '22 17:10


I'm not entirely sure it is the fastest, but I haven't found anything better. Change ba := []byte{} to ba := make([]byte, 0, len(bs)), so that in the end you have:

func B2S(bs []int8) string {
    ba := make([]byte, 0, len(bs))
    for _, b := range bs {
        ba = append(ba, byte(b))
    }
    return string(ba)
}

This way the append function will never try to insert more data than fits in the slice's underlying array, so you avoid unnecessary copying to a bigger array.
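The effect can be made visible by counting how often the capacity changes while appending. A small sketch, where countGrowths is a hypothetical helper and the element count is arbitrary:

```go
package main

import "fmt"

// countGrowths appends n bytes to ba and reports how many times the
// capacity changed, i.e. how often append moved to a new backing array.
// It is a hypothetical helper for illustration only.
func countGrowths(ba []byte, n int) int {
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(ba)
		ba = append(ba, 0)
		if cap(ba) != before {
			growths++
		}
	}
	return growths
}

func main() {
	const n = 1000 // arbitrary element count
	// The count for the empty slice depends on the runtime's growth strategy.
	fmt.Println("empty slice: ", countGrowths([]byte{}, n))
	// With enough capacity reserved up front, append never reallocates.
	fmt.Println("preallocated:", countGrowths(make([]byte, 0, n), n)) // 0
}
```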

Topo answered Oct 24 '22 18:10