I have a string that looks like this: a👏b🙂c, and I want to split it into single characters, each as its own string.
static List<String> split(String text) {
    List<String> list = new ArrayList<>(text.length());
    for (int i = 0; i < text.length(); i++) {
        list.add(text.substring(i, i + 1));
    }
    return list;
}
public static void main(String... args) {
    split("a\uD83D\uDC4Fb\uD83D\uDE42c")
            .forEach(System.out::println);
}
As you may have noticed, instead of 👏 and 🙂 I'm getting two weird characters for each:
a
?
?
b
?
?
c
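As the Character and String API docs explain, you need to work with code points to handle these characters correctly. Java strings are sequences of UTF-16 code units, and characters outside the Basic Multilingual Plane, such as 👏 and 🙂, are stored as a surrogate pair of two chars. Splitting one char at a time cuts each pair in half.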
"a👏b🙂c".codePoints().mapToObj(Character::toChars).forEach(System.out::println);
will output
a
👏
b
🙂
c
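If you need the pieces as a List<String>, as in the original split method, the same idea can be sketched like this. This is a minimal sketch: the class name CodePointSplit is made up for illustration, and new String(Character.toChars(cp)) is used so that it also compiles on Java 8.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, named here only for illustration.
class CodePointSplit {

    // Returns one String per Unicode code point, so a surrogate
    // pair such as \uD83D\uDC4F (👏) stays together in one element.
    static List<String> split(String text) {
        return text.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String... args) {
        split("a\uD83D\uDC4Fb\uD83D\uDE42c").forEach(System.out::println);
    }
}
```

Note that this splits by code point, not by what a user perceives as one character. An emoji built from several code points (for example one with a skin tone modifier) would still come out as more than one element.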