
Is there an equivalent to Java's String intern function in Go?

Tags: java, string, go


I am parsing a lot of text input that has repeating patterns (tags). I would like to be memory efficient about it and store pointers to a single string for each tag, instead of multiple strings for each occurrence of a tag.

Malcolm asked Sep 05 '25 08:09


1 Answer

No such function exists that I know of. However, you can make your own very easily using maps. The string type itself is a pointer to the bytes and a length. So, a string assigned from another string takes up only two words and shares the same underlying bytes. Therefore, all you need to do is ensure that no two strings with the same content are stored separately.

Here is an example of what I mean.

type Interner map[string]string

func NewInterner() Interner {
    return Interner(make(map[string]string))
}

func (m Interner) Intern(s string) string {
    if ret, ok := m[s]; ok {
        return ret
    }

    m[s] = s
    return s
}

This code will deduplicate redundant strings whenever you do the following:

str = interner.Intern(str)
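
To make the effect visible, here is a minimal sketch that interns a slice of tags and checks that equal tags end up sharing one backing array. The helpers internAll and sameBacking are hypothetical names added for illustration, and sameBacking assumes Go 1.20 or later for unsafe.StringData.

```go
package main

import "unsafe"

// Interner maps each distinct string to one canonical copy of itself.
type Interner map[string]string

// NewInterner returns an empty Interner.
func NewInterner() Interner {
	return make(Interner)
}

// Intern returns the canonical copy of s, storing s itself on first sight.
func (m Interner) Intern(s string) string {
	if ret, ok := m[s]; ok {
		return ret
	}
	m[s] = s
	return s
}

// internAll (hypothetical helper) replaces every tag in place with its
// canonical copy, so equal tags share storage afterwards.
func internAll(in Interner, tags []string) {
	for i, t := range tags {
		tags[i] = in.Intern(t)
	}
}

// sameBacking (hypothetical helper) reports whether a and b have the same
// length and point at the same bytes. It is only meaningful for non-empty
// strings and assumes Go 1.20+ for unsafe.StringData.
func sameBacking(a, b string) bool {
	return len(a) == len(b) && unsafe.StringData(a) == unsafe.StringData(b)
}
```

Calling internAll on tags built at runtime (for example from parsed input) and then comparing two equal tags with sameBacking should report true, and the map should hold one entry per distinct tag.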

EDIT: As jnml mentioned, my answer could pin memory depending on the string it is given: if s is a substring of a much larger string, storing it in the map keeps the entire larger string alive. There are two ways to solve this problem. Both should be inserted before m[s] = s in my previous example. The first copies the string twice; the second uses unsafe. Neither is ideal.

Double copy:

b := []byte(s)
s = string(b)

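Unsafe (use at your own risk; works with the current version of the gc compiler and requires importing "unsafe"):

// Reinterpret the slice header as a string header to skip the second copy.
b := []byte(s)
s = *(*string)(unsafe.Pointer(&b))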
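
A third option, which is not part of the original answer, is a single safe copy. This is a minimal sketch that assumes Go 1.18 or later, where strings.Clone is available. It swaps that call in for the two snippets above and otherwise keeps the same Interner type.

```go
package main

import "strings"

// Interner maps each distinct string to one canonical copy.
type Interner map[string]string

// Intern returns the canonical copy of s. On first sight it stores a fresh
// copy rather than s itself, so a short substring does not keep a much
// larger parent string alive.
func (m Interner) Intern(s string) string {
	if ret, ok := m[s]; ok {
		return ret
	}
	// strings.Clone (Go 1.18+) allocates new storage holding only the
	// bytes of s.
	s = strings.Clone(s)
	m[s] = s
	return s
}
```

The copy is only paid the first time a tag is seen; later lookups return the stored string without allocating.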

Stephen Weinberg answered Sep 08 '25 07:09