Why does intern() not work with the literal 'java'?

I tried the code below:

public class TestIntern {
    public static void main(String[] args) {
        char[] c1 = {'a', 'b', 'h', 'i'};
        String s1 = new String(c1);
        s1.intern();
        String s2 = "abhi";
        System.out.println(s1 == s2); // true

        char[] c2 = {'j', 'a', 'v', 'a'};
        String sj1 = new String(c2);
        sj1.intern();
        String sj2 = "java";
        System.out.println(sj1 == sj2); // false

        char[] c3 = {'J', 'A', 'V', 'A'};
        String tj1 = new String(c3);
        tj1.intern();
        String tj2 = "JAVA";
        System.out.println(tj1 == tj2); // true
    }
}

I have tried many different literals.

Could anyone please explain why intern() doesn't work as expected with literal "java"? Why do the above reference comparisons evaluate to true, except when the literal is "java"?

Abhishek Kumar asked Mar 27 '18 21:03


3 Answers

When you call intern() on new String(new char[] {'a', 'b', 'h', 'i'}), no equal string is in the string pool yet, so the reference you just created becomes the canonical one and is stored in the pool. When the literal "abhi" is resolved afterwards, it is looked up in the pool, and your canonical instance is reused.

Your problem is that the string "java" is already in the string pool before your program starts; JDK classes loaded during JVM startup have presumably put it there. Therefore, calling intern() on new String(new char[] {'j', 'a', 'v', 'a'}) does not add your reference to the pool. Instead, it returns the pre-existing canonical instance from the pool, and you ignore that return value.
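
The two cases can be mimicked with a toy pool. This is only a sketch of the observable behavior, assuming a plain HashMap; ToyStringPool is a hypothetical name, and this is not how the JVM implements its string pool internally.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an intern pool (hypothetical; NOT the JVM's real implementation).
final class ToyStringPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance: the one already pooled, or 'candidate'
    // itself if no equal string was pooled before this call.
    String intern(String candidate) {
        String existing = pool.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    public static void main(String[] args) {
        ToyStringPool pool = new ToyStringPool();

        // Simulates a string that was pooled before "your program" started.
        String preexisting = new String("java");
        pool.intern(preexisting);

        // Like the "java" case: the pool hands back the older instance.
        String mine = new String("java");
        System.out.println(pool.intern(mine) == mine);        // false
        System.out.println(pool.intern(mine) == preexisting); // true

        // Like the "abhi" case: nothing equal was pooled, so 'fresh' becomes canonical.
        String fresh = new String("abhi");
        System.out.println(pool.intern(fresh) == fresh);      // true
    }
}
```

In both cases the value returned by intern is the canonical one; only in the second case does it happen to be the object you passed in.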

You should not ignore the return value, but use it. You can never be sure that your "definitely original" string has not been living in the string pool since the JVM started. All of this is implementation dependent, so you should either always use the references returned by the intern() method, or never. Do not mix the two.

Petr Janeček answered Oct 16 '22 14:10


The answer by Petr Janeček is almost certainly correct (+1 there).

Really proving it is hard, because much of the string pool resides in the JVM itself, and one could hardly access it without a tweaked VM.

But here is some more evidence:

public class TestInternEx
{
    public static void main(String[] args)
    {
        char[] c1 = { 'a', 'b', 'h', 'i' };
        String s1 = new String(c1);
        String s1i = s1.intern();
        String s1s = "abhi";
        System.out.println(System.identityHashCode(s1));
        System.out.println(System.identityHashCode(s1i));
        System.out.println(System.identityHashCode(s1s));
        System.out.println(s1 == s1s);// true

        char[] cj = { 'j', 'a', 'v', 'a' };
        String sj = new String(cj);
        String sji = sj.intern();
        String sjs = "java";
        System.out.println(System.identityHashCode(sj));
        System.out.println(System.identityHashCode(sji));
        System.out.println(System.identityHashCode(sjs));
        System.out.println(sj == sjs);// false

        char[] Cj = { 'J', 'A', 'V', 'A' };
        String Sj = new String(Cj);
        String Sji = Sj.intern();
        String Sjs = "JAVA";
        System.out.println(System.identityHashCode(Sj));
        System.out.println(System.identityHashCode(Sji));
        System.out.println(System.identityHashCode(Sjs));
        System.out.println(Sj == Sjs);// true

        char[] ct = { 't', 'r', 'u', 'e' };
        String st = new String(ct);
        String sti = st.intern();
        String sts = "true";
        System.out.println(System.identityHashCode(st));
        System.out.println(System.identityHashCode(sti));
        System.out.println(System.identityHashCode(sts));
        System.out.println(st == sts); // false
    }
}

The program prints, for each string, the identity hash code of

  • the string that is created with new String
  • the string that is returned by String#intern
  • the string that is given as a literal

The output is along the lines of this:

366712642
366712642
366712642
true
1829164700
2018699554
2018699554
false
1311053135
1311053135
1311053135
true
118352462
1550089733
1550089733
false

One can see that for the string "java", the identity hash code of the new String differs from that of the string literal, while the literal's hash code matches that of the result of calling String#intern. This strongly suggests that String#intern returned the very same object as the literal, and not the string that was created with new String.

I also added the string "true" as another test case. It shows the same behavior, presumably because the string "true" has already been interned while bootstrapping the VM.
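
The same check can be made without hash codes by comparing references directly. The following is a minimal sketch, assuming a JVM that follows the String#intern Javadoc (the receiver itself is returned when no equal string is pooled yet); InternProbe and alreadyInterned are hypothetical names. What it prints for "java" and "true" depends on the JVM and version, so no output is shown.

```java
// Hypothetical helper: probes whether an equal string was pooled before the call.
final class InternProbe {
    // Side effect: if nothing equal was pooled, the copy made here is added to the pool.
    static boolean alreadyInterned(char[] chars) {
        String copy = new String(chars); // fresh heap object, never a literal
        return copy.intern() != copy;    // a different object came back: the pool had one
    }

    public static void main(String[] args) {
        // The strings are built from chars so that no literal in this class
        // puts them into the pool before the probe runs.
        char[][] samples = {
            { 'j', 'a', 'v', 'a' },
            { 't', 'r', 'u', 'e' },
            { 'a', 'b', 'h', 'i' },
        };
        for (char[] sample : samples) {
            System.out.println(new String(sample) + " -> " + alreadyInterned(sample));
        }
    }
}
```

Because the probe itself interns the string, each string can only be meaningfully probed once per JVM run.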

Marco13 answered Oct 16 '22 15:10


You are not using intern correctly. intern does not modify the string object it's called on (strings are immutable anyway), but returns the canonical representation of that string, which you are just discarding. Instead, you should assign it to a variable and use that variable in your checks. E.g.:

sj1 = sj1.intern();
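
Applied to the whole program from the question, the fix might look like the sketch below. TestInternFixed and internedCopy are hypothetical names introduced here for illustration; the point is only that the reference returned by intern is the one being compared.

```java
final class TestInternFixed {
    // Builds a fresh String from the chars and returns its canonical (interned) instance.
    static String internedCopy(char[] chars) {
        return new String(chars).intern();
    }

    public static void main(String[] args) {
        String s1 = internedCopy(new char[] {'a', 'b', 'h', 'i'});
        System.out.println(s1 == "abhi");  // true

        String sj1 = internedCopy(new char[] {'j', 'a', 'v', 'a'});
        System.out.println(sj1 == "java"); // true

        String tj1 = internedCopy(new char[] {'J', 'A', 'V', 'A'});
        System.out.println(tj1 == "JAVA"); // true
    }
}
```

All three comparisons are now true regardless of what was in the pool beforehand, because string literals and the results of intern always refer to the canonical instance.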

Mureinik answered Oct 16 '22 14:10