
Java string interning, what is guaranteed?

The question boils down to this code:

// setup
String str1 = "some string";
String str2 = new String(str1);
assert str1.equals(str2);
assert str1 != str2;
String str3 = str2.intern();

// question cases
boolean case1 = str1 == "some string";
boolean case2 = str1 == str3;

Does the Java standard give any guarantees about the values of case1 and case2? A link to the relevant part of the Java spec would be nice, of course.

Yes, I looked at all the "Similar Questions" found by SO, and found no duplicates, as none I found answered the question this way. And no, this is not about the misguided idea of "optimizing" string comparisons by replacing equals with ==.

hyde asked Jan 23 '13 07:01




2 Answers

I think the String.intern API documentation provides enough information:

A pool of strings, initially empty, is maintained privately by the class String.

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

All literal strings and string-valued constant expressions are interned. String literals are defined in section 3.10.5 of The Java™ Language Specification.
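To make the "if and only if" part concrete, here is a minimal sketch. The class name InternContractDemo and the helper samePooledInstance are made up for illustration and are not part of any API; the only library behaviour relied on is the documented contract of String.intern.

```java
// Hypothetical demo class; only String.intern() itself comes from the JDK.
final class InternContractDemo {

    // True when both strings resolve to the same pooled instance.
    static boolean samePooledInstance(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        // Two strings built at run time: equal contents, distinct objects.
        String a = new StringBuilder("inter").append("ned").toString();
        String b = new String(new char[] {'i', 'n', 't', 'e', 'r', 'n', 'e', 'd'});

        System.out.println(a == b);                         // false
        System.out.println(a.equals(b));                    // true
        System.out.println(samePooledInstance(a, b));       // true
        System.out.println(samePooledInstance(a, "other")); // false
    }
}
```

Neither a nor b is a literal here, so the result depends only on the intern contract quoted above, not on compile-time interning.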

Evgeniy Dorofeev answered Oct 05 '22 23:10


Here is your JLS quote, Section 3.10.5:

Each string literal is a reference (§4.3) to an instance (§4.3.1, §12.5) of class String (§4.3.3). String objects have a constant value. String literals, or more generally strings that are the values of constant expressions (§15.28), are "interned" so as to share unique instances, using the method String.intern.

Thus, the test program consisting of the compilation unit (§7.3):

package testPackage;
class Test {
        public static void main(String[] args) {
                String hello = "Hello", lo = "lo";
                System.out.print((hello == "Hello") + " ");
                System.out.print((Other.hello == hello) + " ");
                System.out.print((other.Other.hello == hello) + " ");
                System.out.print((hello == ("Hel"+"lo")) + " ");
                System.out.print((hello == ("Hel"+lo)) + " ");
                System.out.println(hello == ("Hel"+lo).intern());
        }
}

class Other { static String hello = "Hello"; }

and the compilation unit:

package other;

public class Other { public static String hello = "Hello"; }

produces the output: true true true true false true

This example illustrates six points:

Literal strings within the same class (§8) in the same package (§7) represent references to the same String object (§4.3.1).

Literal strings within different classes in the same package represent references to the same String object.

Literal strings within different classes in different packages likewise represent references to the same String object.

Strings computed by constant expressions (§15.28) are computed at compile time and then treated as if they were literals.

Strings computed by concatenation at run time are newly created and therefore distinct.

The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.
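The constant-expression point is the one that tends to surprise people, so here is a small sketch of it. The class and method names (ConstantExpressionDemo and its three helpers) are hypothetical. It relies on the JLS rule that a final String variable initialized with a constant expression is itself a constant variable, so a concatenation of such variables is folded at compile time.

```java
// Hypothetical demo class illustrating compile-time vs run-time concatenation.
final class ConstantExpressionDemo {

    // Constant variable: final, of type String, initialized with a constant expression.
    static final String HEL = "Hel";

    static boolean constantConcatIsPooled() {
        final String lo = "lo";          // also a constant variable
        return "Hello" == (HEL + lo);    // folded at compile time, hence interned
    }

    static boolean runtimeConcatIsPooled() {
        String lo = "lo";                // not final, so the concatenation runs at run time
        return "Hello" == ("Hel" + lo);  // newly created String, distinct from the literal
    }

    static boolean internedRuntimeConcatIsPooled() {
        String lo = "lo";
        return "Hello" == ("Hel" + lo).intern(); // explicit intern yields the pooled literal
    }

    public static void main(String[] args) {
        System.out.println(constantConcatIsPooled() + " "
                + runtimeConcatIsPooled() + " "
                + internedRuntimeConcatIsPooled()); // prints: true false true
    }
}
```

The only difference between the first two methods is the final modifier on lo, which is what decides whether the concatenation counts as a constant expression.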

Combine this with the Javadoc for intern, and you have enough information to deduce that both of your cases evaluate to true.
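For completeness, here are the two cases from the question wrapped in a runnable class. The wrapper name InternQuestionDemo and the methods case1 and case2 are invented for this sketch; the statements inside them are taken from the question.

```java
// Hypothetical wrapper around the code from the question.
final class InternQuestionDemo {

    static boolean case1() {
        String str1 = "some string";
        // Both operands are the same literal, so they refer to the same interned instance.
        return str1 == "some string";
    }

    static boolean case2() {
        String str1 = "some string";
        String str2 = new String(str1); // distinct object with equal contents
        String str3 = str2.intern();    // returns the pooled instance, i.e. the literal
        return str1 == str3;
    }

    public static void main(String[] args) {
        System.out.println(case1() + " " + case2()); // prints: true true
    }
}
```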

Perception answered Oct 06 '22 01:10