I recently migrated an application from JBoss AS 5 to WildFly 8, and as such had to move from Java 6 to Java 8.
I'm now encountering a problem when running one of my unit tests through Ant:
[javac] C:\Users\test\JAXBClassTest.java:123: error: unmappable character for encoding UTF8
Line 123 of the test class is:
Assert.assertEquals("Jµhn", JAXBClass.getValue());
This test is in place specifically to ensure that the JAXB marshaller can handle UTF-8 characters, which I believe µ is. I have added a property to the JAXB marshaller to ensure that these characters are allowed:
marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
I've seen multiple questions (1, 2, 3) on Stack Overflow which seem to be similar, but their answers either explain why invalid characters which were previously decoded one way are now decoded in another, or don't appear to actually have the same issue as me.
If all the characters are valid, should this cause an issue? I know I must be missing something but I can't see what.
The problem is that in your source file the µ is stored as the single byte \265 (0xB5), which is not a valid byte sequence in UTF-8. Encoded as UTF-8, µ (U+00B5) is the two bytes \302\265 (0xC2 0xB5).
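To make the byte-level difference concrete, here is a minimal sketch that prints the bytes of µ in both charsets. The class name `MicroSignBytes` and the helper `hex` are made up for illustration. The literal is written as a Unicode escape so that the sketch does not depend on the encoding of its own source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, used only for illustration.
public class MicroSignBytes {
    // Returns the bytes of text in the given charset as space-separated hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The micro sign, written as an escape instead of a raw character.
        String micro = "\u00B5";
        System.out.println("ISO-8859-1: " + hex(micro, StandardCharsets.ISO_8859_1)); // B5
        System.out.println("UTF-8:      " + hex(micro, StandardCharsets.UTF_8));      // C2 B5
    }
}
```

0xB5 is \265 in octal, and 0xC2 0xB5 is \302\265.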
In this source the character encoding for the file is ISO-8859-1.
class Latin1 {
    public static void main(String[] args) {
        String s = "µ"; // \265
        System.out.println(s);
    }
}
Which can be compiled with ...
javac -encoding iso8859-1 Latin1.java
... but it fails with UTF-8 encoding
javac -encoding UTF-8 Latin1.java
Latin1.java:3: error: unmappable character for encoding UTF-8
String s = "?";
^
In this source the character encoding for the file is UTF-8.
class Utf8 {
    public static void main(String[] args) {
        String s = "µ"; // \302\265 (0xC2 0xB5)
        System.out.println(s);
    }
}
Which can be compiled with UTF-8 as well as with ISO-8859-1, although with ISO-8859-1 the two bytes are read as the two characters Âµ rather than µ.
javac -encoding UTF-8 Utf8.java
javac -encoding iso8859-1 Utf8.java
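The asymmetry between the two files can be reproduced outside the compiler with a strict charset decoder. This is only a sketch of the principle, not of javac's internal decoding path, and the class name `DecodeCheck` and the method `strictDecode` are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, used only for illustration.
public class DecodeCheck {
    // Strictly decodes bytes; returns null when they are not valid in the charset.
    static String strictDecode(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] latin1Bytes = {(byte) 0xB5};            // the micro sign as stored in Latin1.java
        byte[] utf8Bytes = {(byte) 0xC2, (byte) 0xB5}; // the micro sign as stored in Utf8.java

        // A lone 0xB5 is a continuation byte, so it is rejected as UTF-8.
        System.out.println(strictDecode(latin1Bytes, StandardCharsets.UTF_8) == null); // true

        // Every byte is valid in ISO-8859-1, so the UTF-8 bytes decode without
        // error, but as two characters instead of one.
        String misread = strictDecode(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // 2
    }
}
```

This is why compiling Utf8.java as ISO-8859-1 succeeds without error even though the resulting string is not the one intended.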
Edit: In case copying and pasting from the webpage alters the encoding, both source files can be created as below, which should make the difference visible.
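import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// The wrapper class name is illustrative.
public class CreateSources {
    public static void main(String[] args) throws Exception {
        // The Unicode escape below stands for µ, so this generator
        // does not depend on the encoding of its own source file.
        String latin1 = "class Latin1 {\n"
                + "    public static void main(String[] args) {\n"
                + "        String s = \"\u00B5\";\n"
                + "        System.out.println(s);\n"
                + "    }\n"
                + "}";
        // Written as ISO-8859-1: µ becomes the single byte 0xB5.
        Files.write(Paths.get("Latin1.java"),
                latin1.getBytes(StandardCharsets.ISO_8859_1));
        String utf8 = "class Utf8 {\n"
                + "    public static void main(String[] args) {\n"
                + "        String s = \"\u00B5\";\n"
                + "        System.out.println(s);\n"
                + "    }\n"
                + "}";
        // Written as UTF-8: µ becomes the two bytes 0xC2 0xB5.
        Files.write(Paths.get("Utf8.java"),
                utf8.getBytes(StandardCharsets.UTF_8));
    }
}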
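If the test source really is stored as ISO-8859-1, one possible fix is to re-save it as UTF-8. The sketch below assumes, without verifying, that the input is ISO-8859-1 (a Windows editor may have used windows-1252 instead). The class name `Reencode` and the method `latin1ToUtf8` are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, used only for illustration.
public class Reencode {
    // Rewrites a file in place, assuming (not verifying) that it is currently ISO-8859-1.
    static void latin1ToUtf8(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Expects the path of the file to convert as the first argument.
        latin1ToUtf8(Paths.get(args[0]));
    }
}
```

Running this twice on the same file would corrupt it, because bytes that are already UTF-8 would be reinterpreted as ISO-8859-1. Alternatively, the file can be left as it is and the `encoding` attribute of Ant's `<javac>` task set to match the file's actual encoding.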