In the source code of com.sun.org.apache.xerces.internal.impl.XMLScanner, at lines 183 and 186:
183 protected final static String fVersionSymbol = "version".intern();
186 protected final static String fEncodingSymbol = "encoding".intern();
Why are "version" and "encoding" explicitly interned with intern() when they are string literals and would be interned automatically anyway?
I've tracked down the change to revision 318617 in the Apache Xerces SVN Repository (this is the project where this XML parser was initially developed, as the package name suggests).
The relevant part of the commit message is:
Trying to improve the use of symbol tables. Many predefined Strings are added to symbol tables every time the parser is reset. For small documents, this would be a significant cost. Now since we call String#intern for Strings in the symbol table, it's sufficient to use String#intern for those predefined symbols. This only needs to be performed once.
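To illustrate the idea in that commit message, here is a minimal sketch, assuming a symbol table that interns every name it stores. TinySymbolTable, VERSION_SYMBOL and isVersionSymbol are hypothetical names made up for this example; this is not the actual Xerces SymbolTable code.

```java
public class SymbolTableSketch {

    // Hypothetical stand-in for a parser symbol table (NOT the Xerces class).
    // The only property that matters here: every stored name is interned.
    static final class TinySymbolTable {
        String addSymbol(char[] buffer, int offset, int length) {
            return new String(buffer, offset, length).intern();
        }
    }

    // Predefined symbol, created once when the class is initialized
    // instead of being re-added to the table on every parser reset.
    static final String VERSION_SYMBOL = "version".intern();

    // Because both sides are interned, a reference comparison is enough.
    static boolean isVersionSymbol(String symbol) {
        return symbol == VERSION_SYMBOL;
    }

    public static void main(String[] args) {
        char[] input = "version=\"1.0\"".toCharArray();
        String name = new TinySymbolTable().addSymbol(input, 0, 7);
        System.out.println(isVersionSymbol(name)); // true
    }
}
```

Under that assumption, the predefined symbols never need to go through the table at all: they only have to be the same String instances the table would hand out.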
As you noted, the .intern() should not be necessary (and should have no visible effect) on a conforming JVM implementation.
My guess is that
In the second case I'd expect some note of that in a comment or in the commit message, however.
One side effect of that .intern() call is that the initializers are no longer constant expressions, so the fields will not be inlined by other classes referencing them. That ensures that the class XMLScanner is loaded and its field is read. I don't think this is relevant here, however.
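The side effect can be observed through class initialization. The following is a small sketch with made-up classes (Constant, Interned and the log field are hypothetical, not taken from Xerces): reading a field whose initializer is a constant expression does not initialize the declaring class, while reading one initialized via .intern() does.

```java
public class InternInlineDemo {

    // Records which nested classes ran their static initializer.
    static final StringBuilder log = new StringBuilder();

    static class Constant {
        // Constant expression: javac copies the value into the referencing class.
        static final String NAME = "version";
        static { log.append("Constant initialized;"); }
    }

    static class Interned {
        // A method call is not a constant expression: the field is read at run time.
        static final String NAME = "version".intern();
        static { log.append("Interned initialized;"); }
    }

    // Reads both fields and reports which classes were initialized as a result.
    static String touchBoth() {
        String a = Constant.NAME; // inlined, does not trigger Constant's initializer
        String b = Interned.NAME; // real field access, triggers Interned's initializer
        if (a != b) {
            throw new IllegalStateException("both should be the same interned instance");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(touchBoth()); // Interned initialized;
    }
}
```

Both fields still refer to the same interned instance; the only difference is whether the compiler may copy the value into the referencing class.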
I don't believe there's any good reason for that, for the reason you identified: literals are always automatically interned, as defined by the String class:
All literal strings and string-valued constant expressions are interned. String literals are defined in section 3.10.5 of The Java™ Language Specification.
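A quick way to check this behaviour is the sketch below (the class and method names are made up for illustration). It compares references with == on purpose, since identity is what interning is about.

```java
public class LiteralInternDemo {

    static final String VERSION = "version";

    // A literal is already the interned instance, so intern() returns it unchanged.
    static boolean literalIsInterned() {
        return VERSION == VERSION.intern();
    }

    // A string-valued constant expression is folded at compile time and interned too.
    static boolean constantExpressionIsInterned() {
        return ("ver" + "sion") == VERSION;
    }

    // A String created at run time is a distinct object until intern() is called.
    static boolean runtimeCopyIsInterned() {
        String copy = new String("version");
        return copy == copy.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalIsInterned());            // true
        System.out.println(constantExpressionIsInterned()); // true
        System.out.println(runtimeCopyIsInterned());        // false
    }
}
```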