Is there any use case where having codePointBefore() would be advantageous? If you have the index, can't you already call codePointAt(i - 1)?
A code point may consist of multiple chars, each of which is still only a 16-bit UTF-16 code unit. The index passed to these methods in String is an index into its underlying char[] value array, not the index of a code point. The String methods check bounds and then wrap methods of Character:
// Java 8 java.lang.String source code
public int codePointAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointAtImpl(value, index, value.length);
}
//...
public int codePointBefore(int index) {
    int i = index - 1;
    if ((i < 0) || (i >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointBeforeImpl(value, index, 0);
}
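To make the char-index point concrete, here is a minimal sketch. The class name CharIndexDemo and the sample string are only illustrative; the String and Character calls are the standard ones. It shows that a string containing one supplementary character has more chars than code points, and that asking for the code point at the index of the second half of a pair returns a lone surrogate.

```java
class CharIndexDemo {
    // "a", then U+1F600 stored as the surrogate pair D83D DE00, then "b".
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Number of 16-bit char units in the string.
    static int charLength() {
        return SAMPLE.length();
    }

    // Number of Unicode code points in the string.
    static int codePointLength() {
        return SAMPLE.codePointCount(0, SAMPLE.length());
    }

    public static void main(String[] args) {
        System.out.println(charLength());       // 4
        System.out.println(codePointLength());  // 3
        // Index 1 is the high surrogate, so the pair is combined.
        System.out.println(Integer.toHexString(SAMPLE.codePointAt(1))); // 1f600
        // Index 2 is the low surrogate, so it is returned on its own.
        System.out.println(Integer.toHexString(SAMPLE.codePointAt(2))); // de00
    }
}
```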
The corresponding methods in Character identify and combine multiple char values that belong to a single code point:
// Java 8 java.lang.Character source code
static int codePointAtImpl(char[] a, int index, int limit) {
    char c1 = a[index];
    if (isHighSurrogate(c1) && ++index < limit) {
        char c2 = a[index];
        if (isLowSurrogate(c2)) {
            return toCodePoint(c1, c2);
        }
    }
    return c1;
}
//...
static int codePointBeforeImpl(char[] a, int index, int start) {
    char c2 = a[--index];
    if (isLowSurrogate(c2) && index > start) {
        char c1 = a[--index];
        if (isHighSurrogate(c1)) {
            return toCodePoint(c1, c2);
        }
    }
    return c2;
}
The difference is important because index-1 is not always the start of the previous code point. codePointBefore() therefore needs to start at index-1 and look backwards, while codePointAt() needs to start at index and look forward.
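As a rough illustration of where this matters, the sketch below compares the two calls on a string with a surrogate pair and then walks the string backwards one code point at a time. The class name CodePointBeforeDemo and the helper reverseCodePoints are hypothetical names made up for this example; only the String and Character methods come from the JDK.

```java
class CodePointBeforeDemo {
    // "a", then U+1F600 stored as the surrogate pair D83D DE00, then "b".
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Hypothetical helper: collects the code points of s from last to first.
    static int[] reverseCodePoints(String s) {
        int[] out = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = s.length(); i > 0; ) {
            int cp = s.codePointBefore(i);   // looks backwards from index i
            out[n++] = cp;
            i -= Character.charCount(cp);    // step back 1 or 2 chars
        }
        return out;
    }

    public static void main(String[] args) {
        // Index 3 is 'b'; the code point before it is the whole emoji.
        System.out.println(Integer.toHexString(SAMPLE.codePointBefore(3))); // 1f600
        // codePointAt(3 - 1) starts on the low surrogate and returns only that.
        System.out.println(Integer.toHexString(SAMPLE.codePointAt(3 - 1))); // de00

        for (int cp : reverseCodePoints(SAMPLE)) {
            System.out.println(Integer.toHexString(cp)); // 62, 1f600, 61
        }
    }
}
```

So codePointBefore(i) and codePointAt(i - 1) agree only when the previous code point is a single char; backwards iteration is the typical case where codePointBefore() is the natural fit.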