
Is Javascript's toUpperCase() language safe?

Will Javascript's String prototype method toUpperCase() deliver the naturally expected result in every UTF-8-supported language/charset?

I've tried Simplified Chinese, Korean, Tamil, Japanese and Cyrillic, and the results seemed reasonable so far. Can I rely on the method being language-safe?

Example:

  "イロハニホヘトチリヌルヲワカヨタレソツネナラムウヰノオクヤマケフコエテアサキユメミシヱヒモセス".toUpperCase()
> "イロハニホヘトチリヌルヲワカヨタレソツネナラムウヰノオクヤマケフコエテアサキユメミシヱヒモセス"

Edit: As @Quentin pointed out, there is also a String.prototype.toLocaleUpperCase(), which is probably even "safer" to use, but I also have to support IE 8 and above, as well as WebKit-based browsers. Since it is part of the ECMAScript 3 standard, it should be available on all those browsers, right?

Does anyone know of any cases where using it delivers naturally unexpected results?

connexo asked Jun 10 '15 17:06


1 Answer

What do you expect?

JavaScript's toUpperCase() method is supposed to use the locale-invariant uppercase mapping defined by the Unicode standard. So, basically, "i".toUpperCase() is supposed to be I in all cases. Where the locale-invariant uppercase mapping consists of multiple letters, some browsers (mainly older ones) do not uppercase correctly; for example, "ß".toUpperCase() should be SS but is not in every browser.
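A minimal sketch of the locale-invariant behaviour, assuming a current engine that follows the specification (older browsers may differ on the multi-letter case). The variable names are only for illustration.

```javascript
// Locale-invariant mapping: the result does not depend on the user's locale.
const upperI = "i".toUpperCase();

// Multi-letter mapping: one code unit in, two code units out
// (assumes an engine that applies Unicode's special casing rules).
const upperEszett = "ß".toUpperCase();

// Scripts without a case distinction pass through unchanged.
const kana = "イロハ".toUpperCase();

console.log(upperI);              // "I"
console.log(upperEszett);         // "SS" in a spec-following engine
console.log(kana === "イロハ");    // true
```

Because the multi-letter mapping changes the string length, code that assumes the uppercased string has the same length as the input can break.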

Also, there are locales that have different uppercase rules than the rest of the world, the most notable example being Turkish, where the uppercase version of i is İ (and vice versa) and the lowercase version of I is ı (and vice versa).

If you want that behaviour, you will need a browser that is set to Turkish locale, and you have to use the toLocaleUpperCase() method.
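As a sketch of the difference: newer engines with Intl support also accept an explicit locale tag as an argument to toLocaleUpperCase(), so the Turkish rules can be requested without changing the browser's locale. This argument is an assumption about the target environment; it is not part of ECMAScript 3, and older engines such as the one in IE 8 cannot be expected to honour it.

```javascript
// Locale-invariant: always the dotless capital I.
const invariant = "i".toUpperCase();

// Turkish rules requested explicitly via a locale tag
// (assumes an engine with Intl / ECMA-402 support).
const turkishUpper = "i".toLocaleUpperCase("tr"); // U+0130, capital I with dot
const turkishLower = "I".toLocaleLowerCase("tr"); // U+0131, dotless small i

console.log(invariant);     // "I"
console.log(turkishUpper);  // "İ"
console.log(turkishLower);  // "ı"
```

Called without an argument, toLocaleUpperCase() falls back to the host environment's locale, which is the situation described above.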

Also note that some writing systems have a third case, "title case", which is applied to the first letter of a word when you want to "capitalize" it. This is also defined by the Unicode standard (for example, the title case of the ligature ǌ is ǋ, while the upper case is Ǌ), but (as far as I know) it is not available to JavaScript. Therefore, if you try to capitalize a word using substring and toUpperCase, expect it to be wrong in rare cases.
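To illustrate the rare failure, here is a sketch of the usual capitalize approach. naiveCapitalize is a hypothetical helper, not a built-in, and the sample word uses the single-character nj ligature (U+01CC), written as an escape so the code points stay visible.

```javascript
// Hypothetical helper: uppercases the first UTF-16 code unit and keeps the rest.
function naiveCapitalize(word) {
  return word.substring(0, 1).toUpperCase() + word.substring(1);
}

const plain = naiveCapitalize("hello");
// "\u01CC" is the lowercase nj ligature as one character.
const ligature = naiveCapitalize("\u01CCiva");

console.log(plain);                                        // "Hello"
// The first character becomes U+01CA (full upper case),
// whereas Unicode's title case form would be U+01CB.
console.log(ligature.charCodeAt(0).toString(16));          // "1ca"
```

For the vast majority of words the helper gives the expected result; only the few characters with a distinct title case form are affected.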

mihi answered Oct 11 '22 11:10