I need to parse UTF-8 input from a text file character by character. By character I mean a full Unicode code point, not Java's char.
What approach should I use?
There's CharSequence.codePoints(), which returns an IntStream of the Unicode code points in the text.
For example:
// Files.readString(Path) decodes the file as UTF-8 by default (Java 11+)
String text = Files.readString(Path.of("test.txt"));
IntStream codePoints = text.codePoints();
// do something with the code points
codePoints.forEach(codePoint -> System.out.println(codePoint));
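To see why iterating by code point differs from iterating by char, here is a minimal, self-contained sketch. The class name CodePointDemo and the helper splitByCodePoint are made up for this illustration, and the sample string stands in for the file contents.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CodePointDemo {
    // Hypothetical helper: returns one String per Unicode code point.
    static List<String> splitByCodePoint(String text) {
        return text.codePoints()
                // Character.toChars yields one or two chars (a surrogate pair) per code point
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 'a', 'é' (U+00E9), and U+1F600, which needs a surrogate pair in UTF-16
        String text = "a\u00E9\uD83D\uDE00";

        System.out.println(text.length());                          // 4 chars
        System.out.println(text.codePointCount(0, text.length()));  // 3 code points

        // Print each code point in U+XXXX notation
        text.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

If the file is too large to read into a single String, the same idea should still apply to a line at a time, since each line returned by a reader is itself a CharSequence.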