How many unique characters exist in Java? I've looped to over 10,000, and characters are still being found:
for (int i = 0; i < 10000; i++)
    System.out.println((char) i);
Are there Integer.MAX_VALUE characters? I always thought there were only 255 for some reason.
Java uses Unicode. Unicode code points range from U+0000 to U+10FFFF, which makes 1,114,112 possible values.
But not all of them are defined. If you want to know how many of them are "supported" (that is, belong to a Unicode block known to your Java runtime), you can use this:
final long nrChars = IntStream.rangeClosed(0, 0x10ffff)
    .mapToObj(Character.UnicodeBlock::of)
    .filter(Objects::nonNull)
    .count();
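A related count, as a minimal sketch: instead of asking which code points fall into a known block, this asks which ones the runtime considers defined, using Character.isDefined(int). The class and method names are made up for illustration, and the resulting number depends on the Unicode version your Java release ships with, so no expected value is shown.

```java
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only.
class DefinedCodePoints {

    // Counts the code points that this runtime's Unicode tables mark as defined.
    static long countDefined() {
        return IntStream.rangeClosed(Character.MIN_CODE_POINT, Character.MAX_CODE_POINT)
                .filter(Character::isDefined)
                .count();
    }

    public static void main(String[] args) {
        // The result varies with the Java version (newer versions know more of Unicode).
        System.out.println("Defined code points: " + countDefined());
    }
}
```

Note that Character.isDefined also returns true for surrogate and private-use code points, so this is an upper bound on "real" characters rather than an exact tally.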
Also note that, for historical reasons, Java's char can directly represent only code points up to U+FFFF. For the "rest" (which is now pretty much the majority of defined code points), Java uses a surrogate pair: two chars that together encode one code point. See Character.toChars().
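To make the surrogate pair idea concrete, here is a small sketch built on Character.toChars(). The class name and the encode helper are illustrative, not part of any API; U+1F600 was picked simply as an example of a code point above U+FFFF.

```java
// Hypothetical demo class, for illustration only.
class SurrogatePairDemo {

    // Returns the UTF-16 chars for one code point: one char, or two for a surrogate pair.
    static char[] encode(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        int bmp = 0x00E9;            // fits in a single char
        int supplementary = 0x1F600; // above U+FFFF, needs a surrogate pair

        System.out.println(encode(bmp).length);           // prints 1
        System.out.println(encode(supplementary).length); // prints 2

        char[] pair = encode(supplementary);
        // Prints D83D DE00 (high surrogate, then low surrogate)
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]);

        // A String holding the pair has 2 chars but only 1 code point.
        String s = new String(pair);
        System.out.println(s.length() + " chars, "
                + s.codePointCount(0, s.length()) + " code point");
    }
}
```

This is also why casting a loop counter to char, as in the question, can never reach beyond U+FFFF: the cast keeps only the low 16 bits.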
Java was designed to use Unicode internally, so that diverse scripts can be combined in one String. Unicode assigns a number to every character of every script, and those numbers extend into the 3-byte range. Such Unicode "code points" are represented as int in Java.
From the beginning, char and String were for text, with char using UTF-16 (a Unicode representation built from 16-bit units, sometimes needing two chars for one Unicode code point). (However, String constants in a .class file are stored in a modified form of UTF-8.)
A char hence takes 2 bytes.
A byte takes 1 byte, and byte[] is for binary data.
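The difference between text (char, String) and binary data (byte[]) can be sketched as follows. The class and helper names are invented for this example; the sample string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class TextVersusBytes {

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies as UTF-16 (big-endian, no byte order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // 'a', U+00E9 and U+1F600 (the last one written as a surrogate pair)
        String s = "a\u00E9\uD83D\uDE00";

        System.out.println(s.length());                      // prints 4 (chars)
        System.out.println(s.codePointCount(0, s.length())); // prints 3 (code points)
        System.out.println(utf16Length(s));                  // prints 8 (2 bytes per char)
        System.out.println(utf8Length(s));                   // prints 7 (1 + 2 + 4 bytes)
    }
}
```

The same String thus has three different "lengths" depending on whether you count code points, chars, or encoded bytes, which is why a charset must be named whenever text is converted to byte[].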
In earlier languages (C, C++) there was often no such distinction between char and byte.