
Automatic interning of string literals

Tags: java, string

In the source code of com.sun.org.apache.xerces.internal.impl.XMLScanner, at lines 183 and 186:

183    protected final static String fVersionSymbol = "version".intern();

186    protected final static String fEncodingSymbol = "encoding".intern();

Why are "version" and "encoding" explicitly interned with intern() when they are string literals and would be interned automatically anyway?

a Learner asked Nov 02 '12 14:11


2 Answers

I've tracked down the change to revision 318617 in the Apache Xerces SVN Repository (this is the project where this XML parser was initially developed, as the package name suggests).

The relevant part of the commit message is:

Trying to improve the use of symbol tables. Many predefined Strings are added to symbol tables every time the parser is reset. For small documents, this would be a significant cost. Now since we call String#intern for Strings in the symbol table, it's sufficient to use String#intern for those predefined symbols. This only needs to be performed once.
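To illustrate the idea in that commit message: once every name coming out of the symbol table is interned, predefined symbols only have to be interned once, and the scanner can compare by identity (==) instead of equals(). The following is a minimal sketch under that assumption. SymbolIdentityDemo, addSymbol and classify are hypothetical names made up for this example, not the actual Xerces code.

```java
public class SymbolIdentityDemo {
    // Predefined symbols, interned once when the class is initialized.
    static final String VERSION_SYMBOL = "version".intern();
    static final String ENCODING_SYMBOL = "encoding".intern();

    /**
     * Hypothetical stand-in for a symbol table lookup: builds a String from
     * the scanner's buffer and canonicalizes it with String#intern.
     */
    static String addSymbol(char[] buffer, int offset, int length) {
        return new String(buffer, offset, length).intern();
    }

    /** Classifies a name using identity comparison only (no equals()). */
    static String classify(String symbol) {
        if (symbol == VERSION_SYMBOL) {
            return "version";
        }
        if (symbol == ENCODING_SYMBOL) {
            return "encoding";
        }
        return "other";
    }

    public static void main(String[] args) {
        char[] input = "version=\"1.0\" encoding=\"UTF-8\"".toCharArray();
        String first = addSymbol(input, 0, 7);   // the characters "version"
        String second = addSymbol(input, 14, 8); // the characters "encoding"
        System.out.println(classify(first));  // version
        System.out.println(classify(second)); // encoding
        // A non-interned copy is not recognized by the identity check.
        System.out.println(classify(new String("version"))); // other
    }
}
```

The identity check only works because both sides are interned. Whether the predefined side is interned by an explicit intern() call or by being a literal makes no difference to the result.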

As you noted, the .intern() should not be necessary (and should have no visible effect) on a conforming JVM implementation.

My guess is that

  • either the author was not aware of the fact that string literals will always be interned
  • or it was a conscious decision to guard against a misbehaving JVM implementation

In the second case, however, I'd expect some note of that in a code comment or in the commit message.

One side effect of the .intern() call is that the initializers are no longer constant expressions, so the compiler will not inline the field values into other classes that reference them. Those classes must instead read the fields at run time, which ensures that the class XMLScanner is loaded and initialized first. I don't think this is relevant here, however.
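That side effect can be observed through class initialization. The sketch below uses two hypothetical nested classes (WithLiteral and WithIntern, invented for this example): reading the literal-initialized field does not trigger the holder's static initializer because the value is inlined at compile time, while reading the intern()-initialized field does.

```java
public class ConstantInliningDemo {
    // Records which holder classes ran their static initializers.
    static final StringBuilder LOG = new StringBuilder();

    static class WithLiteral {
        // Constant expression: the compiler inlines the value at use sites.
        static final String SYMBOL = "version";
        static {
            LOG.append("WithLiteral initialized. ");
        }
    }

    static class WithIntern {
        // Method call: not a constant expression, so the field is read at run time.
        static final String SYMBOL = "version".intern();
        static {
            LOG.append("WithIntern initialized. ");
        }
    }

    /** Reads both fields and returns the initialization log. */
    static String readBoth() {
        String a = WithLiteral.SYMBOL; // inlined, WithLiteral is not initialized
        String b = WithIntern.SYMBOL;  // real field read, WithIntern is initialized
        if (a != b) {
            throw new IllegalStateException("both should be the same interned instance");
        }
        return LOG.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(readBoth()); // WithIntern initialized.
    }
}
```

Both fields still refer to the same interned String instance. The only observable difference is whether the holder class gets initialized.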

Joachim Sauer answered Nov 05 '22 14:11


I don't believe there's any good reason for that, for the reason you identified: literals are always interned automatically, as documented in the String class (in the Javadoc of intern()):

All literal strings and string-valued constant expressions are interned. String literals are defined in section 3.10.5 of The Java™ Language Specification.
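A small check of that guarantee, as a sketch: the class and method names below are made up for illustration. A literal, an explicitly interned literal, and a compile-time constant expression all refer to the same instance, while a String created at run time does not until it is interned.

```java
public class LiteralInternDemo {
    static final String LITERAL = "version";

    /** An explicit intern() on a literal returns the same instance. */
    static boolean sameAsExplicitIntern() {
        return LITERAL == "version".intern();
    }

    /** "ver" + "sion" is a constant expression, folded at compile time. */
    static boolean sameAsConstantExpression() {
        return LITERAL == "ver" + "sion";
    }

    /** new String(...) always creates a distinct object. */
    static boolean sameAsRuntimeString() {
        return LITERAL == new String("version");
    }

    /** Interning the run-time String yields the canonical instance again. */
    static boolean sameAfterInterningRuntimeString() {
        return LITERAL == new String("version").intern();
    }

    public static void main(String[] args) {
        System.out.println(sameAsExplicitIntern());            // true
        System.out.println(sameAsConstantExpression());        // true
        System.out.println(sameAsRuntimeString());             // false
        System.out.println(sameAfterInterningRuntimeString()); // true
    }
}
```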

T.J. Crowder answered Nov 05 '22 15:11