I'm looking at the OpenJDK implementation of String, and the private, per-instance members look like this:
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence
{
    /** The value is used for character storage. */
    private final char value[];

    /** The offset is the first index of the storage that is used. */
    private final int offset;

    /** The count is the number of characters in the String. */
    private final int count;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    [...]
}
But I know that Java uses references and pools for Strings, to avoid duplication. I was naively expecting a pimpl idiom, where String would in fact be just a ref to an impl. I'm not seeing that so far. Can someone explain how Java knows to use references if I put a String x; member in one of my classes?
Addendum: this is probably wrong, but if I'm in 32-bit mode, should I count 4 bytes for the reference "value[]", 4 bytes for offset, 4 for count and 4 for hash for every instance of class String? That would mean that writing "String x;" in one of my classes automatically adds at least 32 bytes to the "weight" of my class (I'm probably wrong here).
The offset/count fields are somewhat orthogonal to the pooling/intern() issues. Offset and count come into play when you have something like:
String substring = myString.substring(5);
One way to implement this method would be something like:
- Allocate a new char[] with myString.length() - 5 elements
- Copy the characters from index 5 up to myString.length() from myString to the new char[]
- substring is constructed with this new char[]
  - substring.charAt(i) goes directly to chars[i]
  - substring.length() goes directly to chars.length
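The copying approach described above could be sketched roughly as follows. This is only an illustration, not the real java.lang.String: the class name CopyingString and its chars field are made up for the example.

```java
import java.util.Arrays;

// Hypothetical illustration class; NOT the real java.lang.String.
final class CopyingString {
    private final char[] chars;

    CopyingString(char[] chars) {
        this.chars = chars;
    }

    int length() {
        return chars.length;      // goes directly to chars.length
    }

    char charAt(int i) {
        return chars[i];          // goes directly to chars[i]
    }

    // Allocates a new array and copies the tail of the old one into it,
    // so the cost grows with the length of the resulting substring.
    CopyingString substring(int beginIndex) {
        char[] copy = Arrays.copyOfRange(chars, beginIndex, chars.length);
        return new CopyingString(copy);
    }

    @Override
    public String toString() {
        return new String(chars);
    }

    public static void main(String[] args) {
        CopyingString myString = new CopyingString("hello, world".toCharArray());
        System.out.println(myString.substring(5)); // prints ", world"
    }
}
```

Each call to substring here creates two objects: the new CopyingString and the new char[] it wraps.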
As you can see, this approach is O(N), where N is the new string's length, and requires two allocations: the new String and the new char[]. So instead, substring works by reusing the original char[] but with an offset:
- substring.offset = myString.offset + newOffset
- substring.count = myString.count - newOffset
- substring uses myString.chars as its chars array
  - substring.charAt(i) goes to chars[i+substring.offset]
  - substring.length() goes to substring.count
Note that we didn't need to create a new char[], and more importantly, we didn't need to copy the chars from the old char[] to the new one (since there is no new one). So this operation is just O(1) and requires only one allocation, that of the new String.
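A minimal sketch of this array-sharing scheme, assuming fields named value, offset and count as in the snippet from the question. The class SharingString and its sharesStorageWith helper are hypothetical and exist only for illustration; this is not the actual OpenJDK code.

```java
// Hypothetical illustration class; NOT the real java.lang.String.
final class SharingString {
    private final char[] value;  // storage, possibly shared with other instances
    private final int offset;    // first index of the storage that is used
    private final int count;     // number of characters in this string

    SharingString(char[] value) {
        this(value, 0, value.length);
    }

    private SharingString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    int length() {
        return count;
    }

    char charAt(int i) {
        if (i < 0 || i >= count) {
            throw new StringIndexOutOfBoundsException(i);
        }
        return value[offset + i];
    }

    // O(1): no new array and no copying, only a new small wrapper object.
    SharingString substring(int beginIndex) {
        if (beginIndex < 0 || beginIndex > count) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        return new SharingString(value, offset + beginIndex, count - beginIndex);
    }

    // Helper for the demonstration only: true if both use the same array.
    boolean sharesStorageWith(SharingString other) {
        return value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharingString myString = new SharingString("hello, world".toCharArray());
        SharingString substring = myString.substring(5);
        System.out.println(substring);                             // prints ", world"
        System.out.println(substring.sharesStorageWith(myString)); // prints "true"
    }
}
```

One consequence of this design is that a short substring keeps the whole original array reachable for as long as the substring itself is alive.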
Java always uses references to any object. There's no way to make it not use references. As for string pooling, that is achieved by the compiler for string literals and at runtime by calling String.intern. It is natural that most of the implementation of String is oblivious to whether it is dealing with an instance referred to by the constant pool or not.
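A small sketch may make this concrete. The classes Holder and InternDemo are hypothetical names used only for illustration. The field x stores a reference to a String object, not the String's own fields, and == compares those references.

```java
// Hypothetical illustration classes.
final class Holder {
    String x; // stores only a reference to a String object
}

final class InternDemo {
    public static void main(String[] args) {
        Holder h1 = new Holder();
        Holder h2 = new Holder();
        h1.x = "hello";
        h2.x = "hello"; // same literal, so the same pooled instance

        // Explicitly creates a distinct String object with equal contents.
        String built = new String("hello");

        System.out.println(h1.x == h2.x);           // prints "true"
        System.out.println(h1.x == built);          // prints "false"
        System.out.println(h1.x.equals(built));     // prints "true"
        System.out.println(h1.x == built.intern()); // prints "true"
    }
}
```

Here the two Holder instances end up pointing at one shared String object, while new String(...) yields a separate object until intern() maps it back to the pooled one.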