The code below is from the String class in Java. I don't understand why the characters of the two strings are compared twice: first after converting both to upper case and then, if that comparison fails, after converting both to lower case.
My question is: is this required? If so, why?
    public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }
    }
The issue might be more complex.
Some characters have several lowercase code points for a single uppercase code point, or vice versa. So to check for a case-insensitive match, you need to compare both the uppercase and the lowercase versions and treat the characters as equal if either comparison matches.
One example:
The Greek upper-case letter "Σ" has two different lower-case forms: "ς" in word-final position and "σ" elsewhere.
Source: Wikipedia
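To make the sigma example concrete, here is a minimal sketch that checks both lowercase forms with the same `Character` methods the comparator uses. The class name `SigmaCaseDemo` and the helpers `sameUpper` and `sameLower` are made up for illustration and are not part of the JDK.

```java
public class SigmaCaseDemo {
    // Hypothetical helper: do the two chars match after upper-casing?
    public static boolean sameUpper(char a, char b) {
        return Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    // Hypothetical helper: do the two chars match after lower-casing?
    public static boolean sameLower(char a, char b) {
        return Character.toLowerCase(a) == Character.toLowerCase(b);
    }

    public static void main(String[] args) {
        char sigma = '\u03C3';      // GREEK SMALL LETTER SIGMA
        char finalSigma = '\u03C2'; // GREEK SMALL LETTER FINAL SIGMA

        // Both lowercase forms map to the same capital sigma (U+03A3).
        System.out.println(sameUpper(sigma, finalSigma)); // true

        // Both are already lowercase, so lower-casing keeps them distinct.
        System.out.println(sameLower(sigma, finalSigma)); // false

        // The uppercase step is what lets the comparator treat them as equal.
        System.out.println(
            String.CASE_INSENSITIVE_ORDER.compare("\u03C3", "\u03C2")); // 0
    }
}
```

A comparator that only folded to lowercase would report these two letters as different.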
For the case where the uppercase forms differ but the lowercase forms are equal, VGR supplied this excellent example:
A better example would be '\u0130' (İ) and 'I'. Passing them through toUpperCase leaves them unchanged (and therefore different), but passing them through toLowerCase results in identical character values.
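The same kind of check can be run for the '\u0130' example. This is only a sketch: the class name `DottedIDemo` and its two helper methods are hypothetical, and it relies on the char-level `Character.toUpperCase` and `Character.toLowerCase` (not the locale-sensitive `String` methods).

```java
public class DottedIDemo {
    // Hypothetical helper: do the two chars match after upper-casing?
    public static boolean sameUpper(char a, char b) {
        return Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    // Hypothetical helper: do the two chars match after lower-casing?
    public static boolean sameLower(char a, char b) {
        return Character.toLowerCase(a) == Character.toLowerCase(b);
    }

    public static void main(String[] args) {
        char dottedI = '\u0130'; // LATIN CAPITAL LETTER I WITH DOT ABOVE
        char plainI = 'I';

        // Both are already uppercase, so upper-casing leaves them different.
        System.out.println(sameUpper(dottedI, plainI)); // false

        // Both lower-case to 'i' at the char level.
        System.out.println(sameLower(dottedI, plainI)); // true

        // The second (lowercase) step is what makes the comparator return 0.
        System.out.println(
            String.CASE_INSENSITIVE_ORDER.compare("\u0130", "I")); // 0
    }
}
```

Here a comparator that only folded to uppercase would report a difference, which is why the JDK code falls through to the lowercase comparison.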