
What is the last char in lexicographic order?

Tags: java, char

I'd like to know what the last char in Java is. I have a program that deals with words in lexicographic order, and I want to make sure that a certain word always comes last, so I need to know which char it should start with.

Edit: I don't mean the last char of a string. Put simply, I'd like to know what the first char of a string should be so that the string is considered last in lexicographic order when compared with the String.compareTo method.

Popokoko asked Nov 21 '25 12:11

2 Answers

If you are talking about simple char values, then the answer is '\uffff'. (Java char values are really just unsigned 16-bit integers, and '\uffff', or 65535, is the largest integer representable by that type. The \u is Java's Unicode escape syntax.)
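As a rough illustration of the point above, here is a minimal sketch that builds a sentinel string from the largest char value and checks it against an ordinary word with String.compareTo. The class name LastCharDemo, the SENTINEL constant and the sortsBeforeSentinel helper are made up for this example.

```java
public class LastCharDemo {
    // Hypothetical sentinel: a one-char string holding the largest char value (U+FFFF).
    public static final String SENTINEL = String.valueOf(Character.MAX_VALUE);

    // True if the given word sorts strictly before the sentinel under String.compareTo.
    public static boolean sortsBeforeSentinel(String word) {
        return word.compareTo(SENTINEL) < 0;
    }

    public static void main(String[] args) {
        // char is unsigned, so widening the maximum to int gives 65535, not -1.
        System.out.println((int) Character.MAX_VALUE);
        System.out.println(Character.MAX_VALUE == '\uffff');
        System.out.println(sortsBeforeSentinel("zebra"));
    }
}
```

Note that a longer string starting with the same char (for example the sentinel followed by "a") still sorts after the sentinel, so this only guarantees "last" among strings that do not themselves start with U+FFFF.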

However, that ignores the fact that a single Java char can only represent Unicode code points that fall within Unicode plane 0 (the BMP). The standard currently defines planes 0 through 16. Code points in the higher planes are represented as pairs of Java char values; they are called surrogate pairs.

You will need to decide if your application needs to deal with surrogate pairs. (It depends on whether you want to support text that uses "esoteric" characters in the higher Unicode planes.) If it does, then you won't be able to rely on the standard String.compareTo method and the like, because they compare individual char values rather than whole code points. I recommend that you take a look at the ICU libraries.
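To make the surrogate-pair issue concrete, the sketch below compares a string holding U+FFFF with one holding a supplementary code point (U+1F600, written as a surrogate pair). It uses only standard library calls, not ICU, and the class name CodePointOrder and the BY_CODE_POINT comparator are invented for this example; treat it as one possible approach rather than a replacement for a proper collation library.

```java
import java.util.Comparator;

public class CodePointOrder {
    // Hypothetical comparator: orders strings by Unicode code point instead of by char value.
    public static final Comparator<String> BY_CODE_POINT = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            // Advance by one or two chars depending on whether this was a surrogate pair.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    public static void main(String[] args) {
        String bmpMax = "\uFFFF";       // largest single char value
        String emoji = "\uD83D\uDE00";  // U+1F600 as a surrogate pair

        // compareTo sees 0xD83D < 0xFFFF, so the emoji sorts BEFORE U+FFFF.
        System.out.println(emoji.compareTo(bmpMax) < 0);
        // By code point, 0x1F600 > 0xFFFF, so the emoji sorts AFTER U+FFFF.
        System.out.println(BY_CODE_POINT.compare(emoji, bmpMax) > 0);
    }
}
```

If code point order is what you need, the largest possible first code point is Character.MAX_CODE_POINT (U+10FFFF) rather than '\uffff'.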

Stephen C answered Nov 24 '25 02:11

It doesn't represent a valid Unicode character, but the largest value for a char, and therefore the "last" character, is 65535.

char omega = '\uFFFF';

erickson answered Nov 24 '25 02:11