
What is the difference between "text" and new String("text")?

Tags: java, string


new String("text"); explicitly creates a new and referentially distinct instance of a String object; String s = "text"; may reuse an instance from the string constant pool if one is available.

You would very rarely want to use the new String(anotherString) constructor. From the API:

String(String original) : Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since strings are immutable.

Related questions

  • Java Strings: "String s = new String("silly");"
  • Strings are objects in Java, so why don’t we use ‘new’ to create them?

What referential distinction means

Examine the following snippet:

    String s1 = "foobar";
    String s2 = "foobar";

    System.out.println(s1 == s2);      // true

    s2 = new String("foobar");
    System.out.println(s1 == s2);      // false
    System.out.println(s1.equals(s2)); // true

== on two reference types is a reference identity comparison. Two objects that are equals are not necessarily ==. It is usually wrong to use == on reference types; most of the time equals needs to be used instead.

Nonetheless, if for whatever reason you need to create two strings that are equals but not ==, you can use the new String(anotherString) constructor. It needs to be said again, however, that this is very peculiar and is rarely the intention.
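One place where the distinction becomes observable is an identity-based collection. The following is a minimal sketch, not something from the original answer: the class name IdentityKeys and the helper countKeys are hypothetical, and it only assumes the standard java.util.HashMap and java.util.IdentityHashMap classes.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class IdentityKeys {
    // Inserts a literal and an explicit copy of it, then reports how many keys the map holds.
    static int countKeys(Map<String, Integer> map) {
        String pooled = "foobar";
        String copy = new String("foobar"); // equals(pooled), but not == pooled
        map.put(pooled, 1);
        map.put(copy, 2);
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap compares keys with equals: the second put overwrites the first.
        System.out.println(countKeys(new HashMap<>()));         // 1
        // IdentityHashMap compares keys with ==: both entries are kept.
        System.out.println(countKeys(new IdentityHashMap<>())); // 2
    }
}
```

Outside of such identity-sensitive code, the explicit copy buys nothing, which is why the constructor is rarely what you want.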

References

  • JLS 15.21.3 Reference Equality Operators == and !=
  • class Object - boolean equals(Object)

Related issues

  • Java String.equals versus ==
  • How do I compare strings in Java?

String literals will go into String Constant Pool.



Object creation line by line:

String str1 = new String("java5");

The string literal "java5" passed to the constructor is stored in the string constant pool. The new operator then creates a separate String object in the heap with the same value, "java5".

String str2 = "java5";

The reference str2 points to the value already stored in the string constant pool.

String str3 = new String(str2);

A new String object is created in the heap with the same value as the one referenced by str2.

String str4 = "java5";

The reference str4 points to the value already stored in the string constant pool.

Total objects : Heap - 2, Pool - 1
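The object count above can be checked with reference comparisons. This is a small sketch under the same four declarations; the class name StringObjectCount and the helper compareIdentities are hypothetical names introduced for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class StringObjectCount {
    // Returns {str1 == str2, str2 == str4, str1 == str3, str3 == str2}.
    static boolean[] compareIdentities() {
        String str1 = new String("java5"); // first heap object; the literal itself is pooled
        String str2 = "java5";             // the pooled instance
        String str3 = new String(str2);    // second heap object
        String str4 = "java5";             // same pooled instance as str2
        return new boolean[] { str1 == str2, str2 == str4, str1 == str3, str3 == str2 };
    }

    public static void main(String[] args) {
        boolean[] r = compareIdentities();
        System.out.println("str1 == str2: " + r[0]); // false: heap copy vs pooled instance
        System.out.println("str2 == str4: " + r[1]); // true: both refer to the pooled instance
        System.out.println("str1 == str3: " + r[2]); // false: two distinct heap objects
        System.out.println("str3 == str2: " + r[3]); // false: heap copy vs pooled instance
    }
}
```

Three distinct identities show up (str1, str3, and the one shared by str2 and str4), which matches the count of two heap objects plus one pooled string.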

Further reading on Oracle community


This one creates a String in the string constant pool:

String s = "text";

This one creates a string in the constant pool ("text") and another String in normal heap space (s). Both strings have the same value, "text":

String s = new String("text");

The heap copy referenced by s is then lost (eligible for GC) once it is no longer used.

String literals, on the other hand, are reused. If you use "text" in multiple places of your class, it will in fact be one and only one String (i.e. multiple references to the same string in the pool).
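To illustrate the reuse of literals, here is a short sketch in which the same literal appears in two different methods. The class LiteralReuse and its methods are hypothetical names, not part of the original answer.

```java
// Hypothetical demo class; the names are illustrative only.
class LiteralReuse {
    static String first()  { return "text"; }
    static String second() { return "text"; }

    // The two methods return the very same pooled instance.
    static boolean literalsShareInstance() {
        return first() == second();
    }

    // An explicit copy is a different object with equal content.
    static boolean copyIsDistinctButEqual() {
        String copy = new String("text");
        return copy != first() && copy.equals(first());
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());  // true
        System.out.println(copyIsDistinctButEqual()); // true
    }
}
```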


JLS

The concept is called "interning" by the JLS.

Relevant passage from JLS 7 3.10.5:

Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.

Example 3.10.5-1. String Literals

The program consisting of the compilation unit (§7.3):

package testPackage;
class Test {
    public static void main(String[] args) {
        String hello = "Hello", lo = "lo";
        System.out.print((hello == "Hello") + " ");
        System.out.print((Other.hello == hello) + " ");
        System.out.print((other.Other.hello == hello) + " ");
        System.out.print((hello == ("Hel"+"lo")) + " ");
        System.out.print((hello == ("Hel"+lo)) + " ");
        System.out.println(hello == ("Hel"+lo).intern());
    }
}
class Other { static String hello = "Hello"; }

and the compilation unit:

package other;
public class Other { public static String hello = "Hello"; }

produces the output:

true true true true false true

JVMS

JVMS 7 5.1 says:

A string literal is a reference to an instance of class String, and is derived from a CONSTANT_String_info structure (§4.4.3) in the binary representation of a class or interface. The CONSTANT_String_info structure gives the sequence of Unicode code points constituting the string literal.

The Java programming language requires that identical string literals (that is, literals that contain the same sequence of code points) must refer to the same instance of class String (JLS §3.10.5). In addition, if the method String.intern is called on any string, the result is a reference to the same class instance that would be returned if that string appeared as a literal. Thus, the following expression must have the value true:

("a" + "b" + "c").intern() == "abc"

To derive a string literal, the Java Virtual Machine examines the sequence of code points given by the CONSTANT_String_info structure.

  • If the method String.intern has previously been called on an instance of class String containing a sequence of Unicode code points identical to that given by the CONSTANT_String_info structure, then the result of string literal derivation is a reference to that same instance of class String.

  • Otherwise, a new instance of class String is created containing the sequence of Unicode code points given by the CONSTANT_String_info structure; a reference to that class instance is the result of string literal derivation. Finally, the intern method of the new String instance is invoked.
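The relationship between literals and String.intern described in the quote can be observed directly. The sketch below is illustrative only: the class InternDemo and its helpers are hypothetical, and each helper deliberately loads the literal before building the run-time string, so that the literal has already been derived when intern is called.

```java
// Hypothetical demo class; the names are illustrative only.
class InternDemo {
    // Builds "abc" at run time, so the result is a fresh object, not the pooled literal.
    static String build() {
        return new StringBuilder("ab").append("c").toString();
    }

    static boolean sameBeforeIntern() {
        String literal = "abc";    // literal is derived (and interned) first
        return build() == literal; // fresh object vs pooled instance
    }

    static boolean sameAfterIntern() {
        String literal = "abc";
        return build().intern() == literal; // intern returns the pooled instance
    }

    public static void main(String[] args) {
        System.out.println(sameBeforeIntern()); // false
        System.out.println(sameAfterIntern());  // true
    }
}
```

The order matters because of the first bullet above: if intern were called on a run-time string before the matching literal had ever been derived, the literal would resolve to that already interned instance.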

Bytecode

It is also instructive to look at the bytecode implementation on OpenJDK 7.

If we decompile:

public class StringPool {
    public static void main(String[] args) {
        String a = "abc";
        String b = "abc";
        String c = new String("abc");
        System.out.println(a);
        System.out.println(b);
        System.out.println(a == c);
    }
}

we have on the constant pool:

#2 = String             #32   // abc
[...]
#32 = Utf8               abc

and main:

 0: ldc           #2          // String abc
 2: astore_1
 3: ldc           #2          // String abc
 5: astore_2
 6: new           #3          // class java/lang/String
 9: dup
10: ldc           #2          // String abc
12: invokespecial #4          // Method java/lang/String."<init>":(Ljava/lang/String;)V
15: astore_3
16: getstatic     #5          // Field java/lang/System.out:Ljava/io/PrintStream;
19: aload_1
20: invokevirtual #6          // Method java/io/PrintStream.println:(Ljava/lang/String;)V
23: getstatic     #5          // Field java/lang/System.out:Ljava/io/PrintStream;
26: aload_2
27: invokevirtual #6          // Method java/io/PrintStream.println:(Ljava/lang/String;)V
30: getstatic     #5          // Field java/lang/System.out:Ljava/io/PrintStream;
33: aload_1
34: aload_3
35: if_acmpne     42
38: iconst_1
39: goto          43
42: iconst_0
43: invokevirtual #7          // Method java/io/PrintStream.println:(Z)V

Note how:

  • 0 and 3: the same ldc #2 constant is loaded (the literals)
  • 6 to 12: a new String instance is allocated (new) and then initialized by the constructor call (invokespecial), with #2 as argument
  • 35: a and c are compared as regular objects with if_acmpne

The representation of constant strings gets special treatment in the bytecode:

  • it has a dedicated CONSTANT_String_info structure, unlike regular objects (e.g. new String)
  • the structure points to a CONSTANT_Utf8_info structure that contains the data. That is the only data needed to represent the string.

and the JVMS quote above seems to say that whenever the Utf8 pointed to is the same, then identical instances are loaded by ldc.

I have done similar tests for fields, and:

  • static final String s = "abc" points to the constant table through the ConstantValue Attribute
  • non-final fields don't have that attribute, but can still be initialized with ldc

Conclusion: there is direct bytecode support for the string pool, and the memory representation is efficient.

Bonus: compare that to the Integer pool, which does not have direct bytecode support (i.e. no CONSTANT_String_info analogue).
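For comparison, here is a hedged sketch of the Integer cache. Autoboxing goes through Integer.valueOf, an ordinary method call, and the language only guarantees shared instances for values from -128 to 127; the class IntegerCacheDemo and its helpers are hypothetical names.

```java
// Hypothetical demo class; the names are illustrative only.
class IntegerCacheDemo {
    // Values in -128..127 are guaranteed to box to the same instance.
    static boolean smallValuesShareInstance() {
        Integer a = 127; // boxed via Integer.valueOf(127)
        Integer b = 127;
        return a == b;
    }

    // Outside that range only equals is reliable; identity depends on the JVM configuration.
    static boolean largeValuesAreEqual() {
        Integer a = 1000;
        Integer b = 1000;
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(smallValuesShareInstance()); // true
        System.out.println(largeValuesAreEqual());      // true
        Integer x = 1000, y = 1000;
        System.out.println(x == y); // not guaranteed either way; depends on the cache size
    }
}
```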


Every string literal is created inside the string literal pool, and the pool does not allow duplicates. Thus, if two or more String variables are initialized with the same literal value, they all point to the same pooled object.

String obj1 = "abc";
String obj2 = "abc";

"obj1" and "obj2" will point to the same string literal and the string literal pool will have only one "abc" literal.

When we create a String object using the new keyword, the resulting string is stored in heap memory. Any string literal passed as a parameter to the String constructor, however, is stored in the string pool. If we create multiple objects from the same value with the new operator, a new object is created in the heap each time; for this reason, new String(...) should generally be avoided.

String obj1 = new String("abc");
String obj2 = new String("abc");

"obj1" and "obj2" will point to two different objects in the heap and the string literal pool will have only one "abc" literal.

Also worth noting about the behavior of strings: because strings are immutable, a concatenation performed at run time creates a new object in memory, while assigning a different literal to a variable only makes the variable refer to another pooled string.

String str1 = "abc";
String str2 = "abc" + "def";
str1 = "xyz";
str2 = str1 + "ghi";

In the above case:
Line 1: The "abc" literal is stored in the string pool.
Line 2: "abc" + "def" is a constant expression, so the compiler folds it and the resulting "abcdef" is stored in the string pool.
Line 3: A new "xyz" literal is stored in the string pool and str1 now points to it.
Line 4: Because the value is produced by appending to a variable, the result is a new object in heap memory. The appended literal "ghi" is looked up in the string pool and added to it, since it does not exist there yet in this example.
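The difference between Line 2 and Line 4 can be made visible with reference comparisons. This is a minimal sketch with hypothetical names (ConcatDemo and its helpers); the prefix is passed as a method parameter so that the compiler cannot treat it as a constant.

```java
// Hypothetical demo class; the names are illustrative only.
class ConcatDemo {
    // Both operands are literals: the compiler folds them into the pooled "abcdef".
    static boolean constantConcatIsPooled() {
        return ("abc" + "def") == "abcdef";
    }

    // The operand is a variable: the concatenation creates a new String at run time.
    static boolean runtimeConcatIsPooled(String prefix) {
        return (prefix + "ghi") == "xyzghi";
    }

    // A final local initialized with a literal is a constant variable, so folding applies again.
    static boolean finalLocalConcatIsPooled() {
        final String prefix = "xyz";
        return (prefix + "ghi") == "xyzghi";
    }

    // Interning the run-time result yields the pooled instance.
    static boolean internedRuntimeConcatIsPooled(String prefix) {
        return (prefix + "ghi").intern() == "xyzghi";
    }

    public static void main(String[] args) {
        System.out.println(constantConcatIsPooled());             // true
        System.out.println(runtimeConcatIsPooled("xyz"));         // false
        System.out.println(finalLocalConcatIsPooled());           // true
        System.out.println(internedRuntimeConcatIsPooled("xyz")); // true
    }
}
```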