I had (apparenty an incorrect?) impression that Java substring(srcArray, startIndex, endIndex) method did not allocate new memory but instead re-used the existing underlying char[] array. This approach would seem to be possible due to the immutability of String's.
However when actually looking at the JDK source we find the following:
public String substring(int beginIndex, int endIndex) {
return ((beginIndex == 0) && (endIndex == value.length)) ? this
: new String(value, beginIndex, subLen);
}
public String(char value[], int offset, int count) {
this.value = Arrays.copyOfRange(value, offset, offset+count);
}
Notice the copyOfRange which copies/duplicates the content.
The motivation for this was working with very large arrays representing the backing store of matrices for linear algebra. There is NO way we are going to accept copying them. I would like however to have a custom representation of portions of the backing array for convenience in programming. It would however only be reasonable if no data copy were to be performed. The tie-in with String is that I looked into how the String handles no-data-copy and found that they (in later java versions apparently..) do actually perform the copy.
So, is there actually any way to achieve the zero-copy?
In particular, because of changes in string interning in .NET Core 3.0, in some cases the Copy method will not create a new string but will simply return a reference to an existing interned string. Depending on Why you want to call the Copy method, there are a number of alternatives:
A new string with the same value as str. str is null. The Copy method returns a String object that has the same value as the original string but represents a different object reference. It differs from an assignment operation, which assigns an existing string reference to an additional object variable.
We basically need to find if one string contains characters which are subset of characters in second string. First we count occurrences of all characters in second string. Then we traverse through first string and reduce count of every character that is present in first. If at any moment, count becomes less than 0, we return false.
So creating a new string from a substring operation (which is itself a new string) was indeed redundant, as per the preceding diagram (except for one case, see below).
It used to be that way, but it was changed; the designers of Java came to the conclusion that it caused more problems than it solved, specifically due to a) the risk of memory leaks, b) the fact that most substring
calls are for relatively small ranges that are cheap to copy.
The closest thing you can do, to get a CharSequence
view of a subrange of the String
without copying, you can do by writing CharBuffer.wrap(string).subSequence(from, to)
.
The class java.nio.StringCharBuffer
implements such a "view" on a substring without copying of the underlying char[] array of the String. It uses the String.charAt
functions along with an internal offset (number of chars offset from the start of the String) in order to access the String array.
Unfortunately, it is package-private and not usable directly. The superclass java.nio.CharBuffer
however is public, and you could implement your own implementation of a StringCharBuffer using the same approach that the java.nio.StringCharBuffer
takes.
Note that if you do not do this and simply use CharBuffer.wrap
as suggested by another answer you will still copy the array once.
Alternatively, you might just require a Stream of chars beginning from a certain index. In that case, you can use something like:
IntStream
.range(start, str.length())
.map(i -> str.charAt(i))
...
to get a Stream of chars. Note that this is different from doing str.chars().skip(start)
, as the latter will actually advance the cursor by 1 start
times.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With