I tried to type char literals for accented vowels in Java, but the compiler says something like: unclosed character literal
This is what I'm trying to do:
char [] a = {'à', 'á', 'â', 'ä' };
I've tried using Unicode '\u00E0' but for some reason they don't match with my code:
for( char c : string.toCharArray() ) {
    if( c == a[i] ) {
        // I've found a funny letter
    }
}
The if never evaluates to true, no matter what I put in my string.
Here's the complete program I'm trying to code.
There is most likely an encoding mismatch somewhere between your editor and the compiler. The code should be compiled with the same encoding the source file was saved in:
javac -encoding UTF-8 Foo.java
Take this class as an example:
public class Foo {
char [] a = {'à', 'á', 'â', 'ä' };
}
The above code saved as UTF-8 should become the hex dump:
70 75 62 6C 69 63 20 63 6C 61 73 73 20 46 6F 6F public class Foo
20 7B 0D 0A 20 20 63 68 61 72 20 5B 5D 20 61 20 {__ char [] a
3D 20 7B 27 C3 A0 27 2C 20 27 C3 A1 27 2C 20 27 = {'__', '__', '
C3 A2 27 2C 20 27 C3 A4 27 20 7D 3B 20 20 0D 0A __', '__' }; __
7D 0D 0A 0D 0A }____
The UTF-8 encoding of code point U+00E0 (à) is the two-byte sequence C3 A0.
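To make the byte-level picture concrete, here is a minimal sketch (the class and method names are made up for illustration). It prints the UTF-8 bytes of à and then decodes those same bytes as ISO-8859-1, which is one plausible way a mismatch could arise. The result is two characters instead of one, and two characters between single quotes would be consistent with the "unclosed character literal" error. Which encoding javac actually falls back to depends on the platform and JDK version, so treat ISO-8859-1 here as an assumption.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    // Formats each byte as two uppercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes the text as UTF-8, then decodes the bytes with the wrong charset.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String aGrave = "\u00E0"; // a with grave accent, written as an escape
        System.out.println(toHex(aGrave.getBytes(StandardCharsets.UTF_8))); // C3 A0
        System.out.println(misdecode(aGrave).length()); // 2
    }
}
```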
There is an outside chance that à will be represented by the combining sequence U+0061 U+0300 (the letter a followed by a combining grave accent). This is the decomposed NFD form, although I've never come across a text editor that used it as a default for text entry. As Thorbjørn Ravn Andersen points out, it is often better to always use \uXXXX escape sequences - it is less ambiguous.
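If decomposed input is a concern, one option is to normalize the string to NFC before comparing. This is only a sketch: it uses java.text.Normalizer from the standard library, while the class name and the containsChar helper are invented for the example.

```java
import java.text.Normalizer;

public class NormalizeDemo {
    // Normalizes the input to NFC (composed form) before searching for the char.
    static boolean containsChar(String input, char target) {
        String nfc = Normalizer.normalize(input, Normalizer.Form.NFC);
        return nfc.indexOf(target) >= 0;
    }

    public static void main(String[] args) {
        String decomposed = "a\u0300"; // 'a' followed by a combining grave accent
        char aGrave = '\u00E0';        // precomposed a with grave accent

        // A plain search misses the precomposed char in the decomposed string.
        System.out.println(decomposed.indexOf(aGrave) >= 0);      // false
        // After NFC normalization the two forms compare equal.
        System.out.println(containsChar(decomposed, aGrave));     // true
    }
}
```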
You also need to check the encoding of your input source (file, console, etc.), because the string you compare against has to be decoded correctly as well.
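One way to rule out the input side is to name the charset explicitly when reading, rather than relying on the platform default. The sketch below assumes the input really is UTF-8; the class and the readFirstLine helper are hypothetical names used only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class InputEncodingDemo {
    // Reads the first line from the stream, decoding bytes with the given charset.
    static String readFirstLine(InputStream in, Charset charset) {
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the console or piped file delivers UTF-8 bytes.
        String line = readFirstLine(System.in, StandardCharsets.UTF_8);
        System.out.println(line == null ? "(no input)" : line.length() + " chars read");
    }
}
```

If the reported character count is higher than the number of letters you typed, the input is probably being decoded with the wrong charset.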
As a last resort, you can dump your chars as hex with System.out.format("%04x", (int) c); and decode them manually with a character inspector to find out what they are.
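A small helper along those lines might look like this (the class and method names are placeholders). It prints every char of a string as four hex digits, so a decomposed à and a precomposed à become easy to tell apart.

```java
public class CharDump {
    // Renders each UTF-16 char of the string as four lowercase hex digits.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump("\u00E0"));  // 00e0
        System.out.println(dump("a\u0300")); // 0061 0300
    }
}
```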
For Unicode characters to work, you must be certain that javac reads the source file in the same encoding in which it was written.
You will save yourself a lot of trouble by just using the \uXXXX notation.
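For instance, the array from the question could be written with escapes only, so the source file stays pure ASCII and compiles the same way under any -encoding setting. This is a sketch rather than the asker's full program; the class name and the countAccented method are invented for the example.

```java
public class AccentFinder {
    // The four letters from the question, written as Unicode escapes.
    static final char[] ACCENTED = {'\u00E0', '\u00E1', '\u00E2', '\u00E4'};

    // Counts how many chars of the string appear in the ACCENTED array.
    static int countAccented(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            for (char accented : ACCENTED) {
                if (c == accented) {
                    count++;
                    break;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The test string contains two matching letters and one accented e,
        // which is not in the array.
        System.out.println(countAccented("voil\u00E0 d\u00E9j\u00E0")); // 2
    }
}
```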