Is it a good idea to use Unicode symbols as Java identifiers?

I have a snippet of code that looks like this:

double Δt = lastPollTime - pollTime;
double α = 1 - Math.exp(-Δt / τ);
average += α * (x - average);

Just how bad an idea is it to use Unicode characters in Java identifiers? Or is this perfectly acceptable?

Eric asked May 08 '10 11:05


3 Answers

It's a bad idea, for various reasons.

  • Many people's keyboards do not support these characters. If I were to maintain that code on a QWERTY keyboard (or any other without Greek letters), I'd have to copy and paste those characters all the time.

  • Some people's editors or terminals might not display these characters properly. For example, some editors (unfortunately) still default to some ISO-8859 (Latin) variant. The main reason why ASCII is still so prevalent is that it nearly always works.

  • Even if the characters can be rendered properly, they may cause confusion. Straight from Sun (emphasis mine):

    Identifiers that have the same external appearance may yet be different. For example, the identifiers consisting of the single letters LATIN CAPITAL LETTER A (A, \u0041), LATIN SMALL LETTER A (a, \u0061), GREEK CAPITAL LETTER ALPHA (A, \u0391), CYRILLIC SMALL LETTER A (a, \u0430) and MATHEMATICAL BOLD ITALIC SMALL A (a, \ud835\udc82) are all different.

    ...

    Unicode composite characters are different from the decomposed characters. For example, a LATIN CAPITAL LETTER A ACUTE (Á, \u00c1) could be considered to be the same as a LATIN CAPITAL LETTER A (A, \u0041) immediately followed by a NON-SPACING ACUTE (´, \u0301) when sorting, but these are different in identifiers.

    This is in no way an imaginary problem: α (U+03b1 GREEK SMALL LETTER ALPHA) and ⍺ (U+237a APL FUNCTIONAL SYMBOL ALPHA) are different characters!

  • It is hard to tell which characters are valid. The characters from your code work, but when I use the APL FUNCTIONAL SYMBOL ALPHA, my Java compiler complains about "illegal character: \9082" (9082 is the decimal value of U+237a), even though the functional symbol would be more appropriate in this code. There seems to be no solid rule about which characters are acceptable, except asking Character.isJavaIdentifierPart().

  • Even though you may get it to compile, it seems doubtful that all Java virtual machine implementations have been rigorously tested with Unicode identifiers. If these characters are only used for variables in method scope, they should get compiled away, but if they are class members, they will end up in the .class file as well, possibly breaking your program on buggy JVM implementations.
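The points above about valid characters and composed versus decomposed forms can be checked with the standard library. The following is only a minimal sketch: the class and method names are made up for illustration, it relies on Character.isJavaIdentifierStart(), Character.isJavaIdentifierPart() and java.text.Normalizer, and it does not reject reserved words. Non-ASCII characters are written as escapes inside string literals so the file compiles regardless of the source encoding.

```java
import java.text.Normalizer;

// Hypothetical helper class, for illustration only.
final class IdentifierCharCheck {

    // Returns true if every code point in name is legal at its position
    // in a Java identifier. Reserved words such as "class" are NOT rejected.
    static boolean hasOnlyIdentifierChars(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        int i = Character.charCount(first);
        while (i < name.length()) {
            int codePoint = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(codePoint)) {
                return false;
            }
            i += Character.charCount(codePoint);
        }
        return true;
    }

    // Returns true if the two names differ as written but become equal
    // after NFC normalization (composed vs. decomposed characters).
    static boolean differOnlyByNormalization(String a, String b) {
        return !a.equals(b)
                && Normalizer.normalize(a, Normalizer.Form.NFC)
                        .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String greekAlpha = "\u03b1";    // GREEK SMALL LETTER ALPHA
        String aplAlpha = "\u237a";      // APL FUNCTIONAL SYMBOL ALPHA
        String composed = "\u00c1";      // LATIN CAPITAL LETTER A WITH ACUTE
        String decomposed = "A\u0301";   // A followed by COMBINING ACUTE ACCENT

        System.out.println(hasOnlyIdentifierChars(greekAlpha)); // true
        System.out.println(hasOnlyIdentifierChars(aplAlpha));   // false
        System.out.println(differOnlyByNormalization(composed, decomposed)); // true
    }
}
```

A check like this tells you what the compiler will accept, but it does nothing about lookalike characters: the Greek and Latin capital A are both valid and still name different variables.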

Thomas answered Oct 09 '22 13:10


It looks good as it uses the correct symbols, but how many of your team will know the keystrokes for those symbols?

I would use an English representation just to make it easier to type. Others might also not have a character set that supports those symbols set up on their PC.

Mauro answered Oct 09 '22 14:10


That code is fine to read, but horrible to maintain. I suggest using plain English identifiers, like so:

double deltaTime = lastPollTime - pollTime;
double alpha = 1 - Math.exp(-delta....
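
For comparison, here is a self-contained sketch of the same smoothing step with ASCII-only names. The class name, constructor and update method are hypothetical. It assumes tau is a positive time constant in the same unit as the poll times, and it computes the elapsed time as pollTime - lastPollTime so that it is non-negative. The question's snippet writes the subtraction the other way round, so flip the sign if your variables mean something different.

```java
// Hypothetical wrapper around the smoothing step from the question,
// written with ASCII-only identifiers.
final class SmoothedAverage {
    private final double tau;     // time constant (assumed positive)
    private double average;       // current smoothed value
    private double lastPollTime;  // time of the previous update

    SmoothedAverage(double tau, double initialAverage, double startTime) {
        if (!(tau > 0)) {
            throw new IllegalArgumentException("tau must be positive");
        }
        this.tau = tau;
        this.average = initialAverage;
        this.lastPollTime = startTime;
    }

    // Folds the sample x, taken at pollTime, into the running average.
    double update(double x, double pollTime) {
        double deltaTime = pollTime - lastPollTime;    // assumed non-negative
        double alpha = 1 - Math.exp(-deltaTime / tau); // weight of the new sample
        average += alpha * (x - average);
        lastPollTime = pollTime;
        return average;
    }

    public static void main(String[] args) {
        SmoothedAverage smoothed = new SmoothedAverage(2.0, 0.0, 0.0);
        System.out.println(smoothed.update(10.0, 1.0));
        System.out.println(smoothed.update(10.0, 2.0));
    }
}
```

Every name here can be typed on any keyboard and survives any source encoding, at the cost of being a little longer than the Greek letters.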

Crozin answered Oct 09 '22 14:10