I have a string that looks like this: a👏b🙂c
and I want to split it into single-character strings.
import java.util.ArrayList;
import java.util.List;

public class Main {
    static List<String> split(String text) {
        List<String> list = new ArrayList<>(text.length());
        // Takes one char at a time
        for (int i = 0; i < text.length(); i++) {
            list.add(text.substring(i, i + 1));
        }
        return list;
    }

    public static void main(String... args) {
        split("a\uD83D\uDC4Fb\uD83D\uDE42c")
                .forEach(System.out::println);
    }
}
As you may have noticed, instead of 👏 and 🙂 I'm getting two weird characters for each emoji:
a
?
?
b
?
?
c
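As the Character and String API docs explain, a Java String is a sequence of UTF-16 code units (char values), and characters outside the Basic Multilingual Plane, such as these emoji, are stored as a surrogate pair of two chars. Splitting char by char cuts each pair in half, which is why every emoji shows up as two ? characters. To handle such characters correctly you need to iterate over code points instead of chars: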
"a👏b🙂c".codePoints().mapToObj(Character::toChars).forEach(System.out::println);
This prints:
a
👏
b
🙂
c
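If you want the original split method to return a List<String> rather than just print, the same idea can be sketched as follows. This is a minimal sketch, not the only way to do it; the class name CodePointSplit is made up for illustration, and it only uses String.codePoints() and Character.toChars(), so it should work on Java 8 and later.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name, chosen only for this example.
public class CodePointSplit {

    // Returns one String per Unicode code point, so the two chars of a
    // surrogate pair (e.g. an emoji outside the BMP) stay together.
    public static List<String> split(String text) {
        return text.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String... args) {
        split("a\uD83D\uDC4Fb\uD83D\uDE42c")
                .forEach(System.out::println);
    }
}
```

Note that this splits by code point, not by what a user perceives as one character: sequences built from several code points (an emoji followed by a skin-tone modifier, or a letter followed by a combining accent) will still come out as separate elements.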