
Does Python intern strings?

In Java, String literals are interned by the JVM, so that subsequent declarations of the same literal result in two references to the same String instance, rather than two separate (but identical) Strings.

For example:

public String baz() {
    String a = "astring";
    return a;
}

public String bar() {
    String b = "astring";
    return b;
}

public void main() {
    String a = baz();
    String b = bar();
    assert(a == b); // passes: both references point to the same interned String
}

My question is, does CPython (or any other Python runtime) do the same thing for strings? For example, if I have some class:

class example():
    def __init__(self):
        self._inst = 'instance'

And create 10 instances of this class, will each one of them have an instance variable referring to the same string in memory, or will I end up with 10 separate strings?

csvan asked Jul 16 '13 14:07



1 Answer

This is called interning, and yes, Python does do this to some extent, for shorter strings created as string literals. See "About the changing id of an immutable string" for some discussion.
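As a rough check of the scenario from the question, the sketch below assumes CPython, where a string literal in a function body is stored once as a constant of the compiled code object, so every call binds the same object. The class name Example and the variable names are illustrative only, and other runtimes are free to behave differently.

```python
class Example:
    def __init__(self):
        # On CPython the literal lives in __init__'s code object constants,
        # so each instance ends up referring to that one str object.
        self._inst = 'instance'


instances = [Example() for _ in range(10)]

# Collect the identities of the strings held by the ten instances.
distinct_ids = {id(obj._inst) for obj in instances}
print(len(distinct_ids))  # 1 on CPython: ten references, one string
```

Identity (the is operator or id()) is what shows sharing here; == would be True even for ten separate copies.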

Interning is runtime dependent; there is no standard for it. Interning is always a trade-off between memory use and the cost of checking whether you are creating the same string. There is the sys.intern() function to force the issue if you are so inclined; its documentation describes some of the interning Python does for you automatically:

Normally, the names used in Python programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys.

Note that in Python 2, the intern() function was a built-in; no import was necessary.
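A minimal sketch of forcing the issue with sys.intern() in Python 3 follows. It builds two equal strings at runtime with str.join so that they are not compile-time constants; the variable names are illustrative, and whether a is b holds before interning is an implementation detail (on CPython it is normally False here).

```python
import sys

# Build two equal strings at runtime rather than from a single literal.
a = ''.join(['inst', 'ance'])
b = ''.join(['inst', 'ance'])

print(a == b)  # True: the contents are equal
print(a is b)  # implementation detail; normally False on CPython

# sys.intern() returns the canonical copy from the interpreter's intern table,
# so equal strings map to the same object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True
```

Interning only pays off when the same string values are compared or used as dictionary keys many times; for one-off strings the table lookup is pure overhead.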

Martijn Pieters answered Oct 21 '22 20:10