
Unicode lowercase characters?

I read somewhere that Unicode has characters other than A-Z that have a lowercase equivalent. Which characters are these, and why would any other character need an upper and lower case?

Robin Rodricks asked May 30 '09 05:05


2 Answers

The English language, and even that strange variant, American English :-), is not the only language on the planet. There are some very strange-looking ones (at least to those of us used to Latin-based characters), but even the Latin-based ones have minor variations.

Two that I am acquainted with on more than a casual basis are Greek and German:

Αα Ββ Γγ Δδ Εε Ζζ  Ηη Θθ Ιι Κκ Λλ Μμ
Νν Ξξ Οο Ππ Ρρ Σσς Ττ Υυ Φφ Χχ Ψψ Ωω

Aa Ää Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk Ll Mm Nn
Oo Öö Pp Qq Rr Ss ß  Tt Uu Üü Vv Ww Xx Yy Zz

That's why we're not allowed to use bits of code like:

char lower = upper - 'A' + 'a';

any more, because that arithmetic assumes the ASCII letters A-Z. Doing something like that in a company that takes i18n seriously is near grounds for dismissal. Using Unicode-aware toLower()/toUpper()-type functions is the better way to go.
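To make the difference concrete, here is a minimal sketch in Python (my choice for illustration, not the language of the snippet above). `naive_lower` is a hypothetical helper that mirrors the arithmetic trick; `str.lower()` and `str.upper()` are Python's built-in Unicode-aware conversions.

```python
def naive_lower(ch: str) -> str:
    """Hypothetical helper: the ASCII trick, shifting by the distance from 'A' to 'a'."""
    return chr(ord(ch) - ord("A") + ord("a"))

# Fine for A-Z.
print(naive_lower("Q"))                  # q

# Wrong for U+0100 (A with macron): its lowercase is U+0101, an offset of 1, not 32.
print(naive_lower("\u0100") == "\u0101")  # False
print("\u0100".lower() == "\u0101")       # True

# Wrong for anything that is not an uppercase letter at all.
print(naive_lower("1"))                  # Q

# Unicode-aware conversion also copes with letters that have no one-to-one partner.
print("ß".upper())                       # SS
print("ΣΟΦΟΣ".lower())                   # σοφος (final sigma at the end of the word)
```

The trick does happen to land on the right code point for some non-ASCII letters, which is exactly what makes it dangerous: it looks fine in testing and then fails on the first letter that does not sit 32 code points from its partner.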

paxdiablo answered Sep 20 '22 03:09


There are a lot of alphabets other than the usual Latin-derived western European alphabet most of us are used to seeing here. To start with, you'd need uppercase and lowercase versions of accented letters and ligatures, like Àà, Ĳĳ, and so on. There are also the fullwidth versions of Latin characters used when setting documents in Asian languages (which I'm too lazy to look up). Further, there are the other alphabets in use nowadays, like the Cyrillic (Бб) and Greek (Δδ) alphabets.
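If you want to check any of these yourself, something like the following Python sketch should do. It only uses the standard `unicodedata` module and `str.lower()`; the sample list and the `uppercase_with_lowercase` helper are my own names, picked to match the kinds of characters mentioned above (including a fullwidth A, U+FF21).

```python
import unicodedata

# One sample per category: accented letter, ligature, fullwidth, Cyrillic, Greek.
samples = ["\u00C0", "\u0132", "\uFF21", "\u0411", "\u0394"]

# Map each uppercase sample to whatever Unicode says its lowercase is.
pairs = {upper: upper.lower() for upper in samples}

for upper, lower in pairs.items():
    print(f"U+{ord(upper):04X} {unicodedata.name(upper)}"
          f" -> U+{ord(lower):04X} {unicodedata.name(lower)}")

def uppercase_with_lowercase(start: int, stop: int) -> list[str]:
    """Characters in [start, stop) whose lowercase form differs from themselves."""
    return [chr(cp) for cp in range(start, stop) if chr(cp).lower() != chr(cp)]

# The basic Cyrillic capitals, U+0410 through U+042F.
print("".join(uppercase_with_lowercase(0x0410, 0x0430)))
```

Running the helper over a wider range is an easy way to answer the "which characters" part of the question for whatever Unicode version your Python ships with.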

There's also Turkey, which is just kind of difficult according to Jeff Atwood: Turkish has both a dotted and a dotless I, so i and I are not a case pair there. Using the uppercasing/lowercasing functions provided by your environment is (usually) the way to go with user-input data.
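A rough Python sketch of the Turkish case, for illustration only. Python's built-in `str.upper()` and `str.lower()` use the default, locale-independent Unicode mappings, so they do not apply the Turkish rules. The `turkish_upper` and `turkish_lower` helpers below are hypothetical names of mine that special-case the two I pairs; a real application would more likely lean on a locale-aware library.

```python
DOTTED_CAPITAL_I = "\u0130"   # İ
DOTLESS_SMALL_I = "\u0131"    # ı

# Default mappings: correct for English, wrong for Turkish.
print("i".upper())                      # I
print("I".lower())                      # i
# The default lowercase of İ is "i" followed by a combining dot above (U+0307).
print(DOTTED_CAPITAL_I.lower() == "i\u0307")

def turkish_upper(text: str) -> str:
    """Hypothetical helper: uppercase with i -> İ and ı -> I."""
    return text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I").upper()

def turkish_lower(text: str) -> str:
    """Hypothetical helper: lowercase with İ -> i and I -> ı."""
    return text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I).lower()

print(turkish_upper("istanbul"))        # İSTANBUL
print(turkish_lower("ISPARTA"))         # ısparta
```

The point is less the helpers themselves than the fact that the right answer depends on the language of the text, which the characters alone cannot tell you.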

Paul Fisher answered Sep 18 '22 03:09