I have a test string:
String test = "oiwfoilfhlshflkshdlkfhsdlfhlskdhfslkhvslkvhvkjdhfkljshvdfkjhvdsköljhvskljdfhvblskjbkvljslkhjjssdlkhdsflksjflkjdlfjslkjljlfjslfjldfjjhvbksdjhbvslkdfjhbvslkjvhbslkvbjbn";
While debugging I noticed the following. When I print out the length:
System.out.println("Test length() : " + test.length());
returns
Test length() : 166
But in the debugger, the count field of the test variable shows 333.
What does the count represent?
A String implementation holds its characters in an array, value. In some implementations the count field is used to compute the size of that array.
Notice that the count you see is roughly twice the String length. This hints at an ASCII/UTF-8/UTF-16 difference: in a String instance each char is a UTF-16 code unit, which takes 2 bytes.
An example:
String str = "f";
str.length(); // 1
str.getBytes().length; // 1 (assuming the default charset is UTF-8)
but
String str = "ў";
str.length(); // 1
str.getBytes().length; // 2 (assuming the default charset is UTF-8)
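The no-argument getBytes() depends on the platform's default charset, so the byte counts above can vary between machines. Here is a minimal sketch that names the charset explicitly. The class and helper names (CharsVersusBytes, utf8Bytes, utf16Bytes) are made up for illustration, and "\u045E" is the escaped form of ў.

```java
import java.nio.charset.StandardCharsets;

public class CharsVersusBytes {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as UTF-16 (big-endian, no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // "\u045E" is the Cyrillic letter used in the example above.
        for (String s : new String[] {"f", "\u045E"}) {
            System.out.println(s.length() + " char, "
                    + utf8Bytes(s) + " UTF-8 byte(s), "
                    + utf16Bytes(s) + " UTF-16 byte(s)");
        }
    }
}
```

Both strings have length() 1 and take 2 bytes in UTF-16; only the UTF-8 size differs (1 byte for f, 2 bytes for ў).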
Which JDK are you using? That may shed more light on what exactly your count is.
When asking Android-related Java questions, always mention that, as there are some major differences.
The Android ART runtime optimizes java.lang.String by storing the normally two-byte Java chars as single bytes when the whole string is ASCII. You can see it in the source of java.lang.String:
public int length() {
    // BEGIN Android-changed: Get length from count field rather than value array (see above).
    // return value.length;
    final boolean STRING_COMPRESSION_ENABLED = true;
    if (STRING_COMPRESSION_ENABLED) {
        // For the compression purposes (save the characters as 8-bit if all characters
        // are ASCII), the least significant bit of "count" is used as the compression flag.
        return (count >>> 1);
    } else {
        return count;
    }
}
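To see what that shift does to the value from the question, here is a small sketch that splits a raw count into its two parts. CountDecoder and its method names are hypothetical helpers, not part of the Android sources. They only replay the bit operations described in the comment above.

```java
public class CountDecoder {
    // Drops the lowest bit, as the quoted length() does with "count >>> 1".
    static int lengthOf(int count) {
        return count >>> 1;
    }

    // Extracts the lowest bit, which the quoted comment calls the compression flag.
    static int flagBitOf(int count) {
        return count & 1;
    }

    public static void main(String[] args) {
        int count = 333; // the value seen in the debugger
        System.out.println("length = " + lengthOf(count));  // 166
        System.out.println("flag   = " + flagBitOf(count)); // 1
    }
}
```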
String compression is specified in the native code as:
// String Compression
static constexpr bool kUseStringCompression = true;
enum class StringCompressionFlag : uint32_t {
    kCompressed = 0u,
    kUncompressed = 1u
};
This flag is OR-ed with the count value:
static int32_t GetFlaggedCount(int32_t length, bool compressible) {
    return kUseStringCompression
        ? static_cast<int32_t>((static_cast<uint32_t>(length) << 1) |
                               (static_cast<uint32_t>(compressible
                                                          ? StringCompressionFlag::kCompressed
                                                          : StringCompressionFlag::kUncompressed)))
        : length;
}
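When loading strings from the constant pool, however, string compression is not performed. Hence you get double the original char count plus 1 (333 = 166 * 2 + 1). That additional 1 is the "uncompressed" flag. Note also that the test string contains the non-ASCII character ö, so under the all-ASCII rule quoted above it would not qualify for compression anyway.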
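The arithmetic can be checked with a rough Java transliteration of the native GetFlaggedCount shown above, assuming string compression is enabled. FlaggedCountSketch and its constants are illustrative names; this is not code from ART itself.

```java
public class FlaggedCountSketch {
    // Flag values as in the StringCompressionFlag enum quoted above.
    static final int COMPRESSED = 0;
    static final int UNCOMPRESSED = 1;

    // Shifts the length left by one bit and stores the flag in the freed lowest bit.
    static int flaggedCount(int length, boolean compressible) {
        return (length << 1) | (compressible ? COMPRESSED : UNCOMPRESSED);
    }

    public static void main(String[] args) {
        System.out.println(flaggedCount(166, false)); // 333, uncompressed
        System.out.println(flaggedCount(166, true));  // 332, compressed
    }
}
```

A 166-char string that is stored compressed would show a count of 332 instead.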