These days, more languages are using Unicode, which is a good thing. But it also presents a danger. In the past there were troubles distinguishing between 1 and l, and between 0 and O. But now we have a whole new range of similar-looking characters.
For example:
ì, î, ï, ı, ι, ί, ׀ ,أ ,آ, ỉ, ﺃ
With these, it is not difficult to create some very hard-to-find bugs.
At my work, we have decided to stay with the ANSI characters for identifiers. Is anybody out there using Unicode identifiers, and if so, what has your experience been?
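To make the risk concrete, here is a minimal sketch in Java. It shows that look-alike letters are distinct strings, and it flags any name containing characters beyond 7-bit ASCII. The class name `IdentifierCheck` and the helper `nonAsciiCodePoints` are made up for illustration, and the ASCII-only rule is just one possible policy, not a standard tool.

```java
import java.util.List;
import java.util.stream.Collectors;

class IdentifierCheck {
    // Hypothetical helper: returns the code points in `identifier`
    // that fall outside 7-bit ASCII (anything above 0x7F).
    static List<Integer> nonAsciiCodePoints(String identifier) {
        return identifier.codePoints()
                .filter(cp -> cp > 0x7F)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String latin = "i";          // U+0069 LATIN SMALL LETTER I
        String dotless = "\u0131";   // U+0131 LATIN SMALL LETTER DOTLESS I
        String iota = "\u03B9";      // U+03B9 GREEK SMALL LETTER IOTA

        // Visually close, but these are different strings.
        System.out.println(latin.equals(dotless)); // false
        System.out.println(latin.equals(iota));    // false

        // Flag names that contain anything beyond ASCII.
        for (String name : List.of("index", "\u0131ndex", "\u03B9ndex")) {
            System.out.printf("%s -> non-ASCII code points: %s%n",
                    name, nonAsciiCodePoints(name));
        }
    }
}
```

A check like this could run in a build step or a commit hook; whether that is worth the effort depends on how strictly a team wants to enforce the rule.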
The first version of Unicode was introduced in 1991. The Unicode character set was designed to include all the characters of all the world's languages and scripts.
The Java compiler works on Unicode characters. A Java source file is normally encoded in ASCII or some extension of ASCII. When decoding the source into Unicode, the compiler first replaces any Unicode escapes (\uXXXX) in the file with the characters they represent.
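A small sketch of what that escape replacement means in practice: because escapes are translated before the source is tokenized, they work in identifiers as well as in character literals. The class and method names here are made up for the example.

```java
class UnicodeEscapeDemo {
    // The escape in the declaration below is translated to the letter 'a'
    // before the compiler tokenizes the source, so the field is named "abc".
    static int \u0061bc = 42;

    static int readThroughPlainName() {
        return abc; // same field as the one declared with the escape
    }

    public static void main(String[] args) {
        char letter = '\u0041'; // this escape stands for 'A'
        System.out.println(letter);                 // A
        System.out.println(readThroughPlainName()); // 42
    }
}
```

This also means an escape can hide a non-ASCII character in a file that is itself pure ASCII, so checking only the file encoding does not rule out Unicode identifiers.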
The Unicode character set has no provision for removing or changing an assigned character, so newer versions of Unicode can only add new characters or deprecate existing ones. The blocks for the South Central and South East Asian scripts in Unicode are summarized in Tables 3 to 7.
In the Unicode standard, the code-point range D800 to DFFF (hex) is not assigned to any character and is reserved for surrogates. For characters in the range 0000 to FFFF (hex), the code-point value and the UTF-16 code unit are the same.
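The relationship between code points and UTF-16 code units can be seen with a short Java sketch, since Java's `char` is a UTF-16 code unit. The class name `SurrogateDemo` and the helper `utf16Units` are illustrative; the `Character` and `String` methods are from the standard library.

```java
class SurrogateDemo {
    // Number of UTF-16 code units needed for one code point (1 or 2).
    static int utf16Units(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        int bmp = 0x00E9;   // e with acute accent, inside 0000 to FFFF
        int supp = 0x1F600; // an emoji, above FFFF

        char[] bmpUnits = Character.toChars(bmp);
        char[] suppUnits = Character.toChars(supp);

        // In the 0000 to FFFF range: one code unit, equal to the code point.
        System.out.printf("U+%04X -> %d unit(s), first unit %04X%n",
                bmp, bmpUnits.length, (int) bmpUnits[0]);

        // Above FFFF: a high surrogate followed by a low surrogate,
        // both taken from the reserved D800 to DFFF range.
        System.out.printf("U+%04X -> %d unit(s): %04X %04X%n",
                supp, suppUnits.length, (int) suppUnits[0], (int) suppUnits[1]);
        System.out.println(Character.isHighSurrogate(suppUnits[0])); // true
        System.out.println(Character.isLowSurrogate(suppUnits[1]));  // true

        String s = new String(suppUnits);
        System.out.println(s.length());                      // 2 (code units)
        System.out.println(s.codePointCount(0, s.length())); // 1 (code point)
    }
}
```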
Besides the similar-character bugs you mention, technical issues can arise when using different editors: files saved with or without a BOM, different encodings ending up in the same file through copy-pasting (which is only a problem when there are actually characters that cannot be encoded in ASCII), and so on. For these reasons, I find that it's not worth using Unicode characters in identifiers. English has become the lingua franca of development and you should stick to it while writing code.
This I find particularly true for code that may be seen anywhere in the world by any developer (open source, or code that is sold along with the product).