I'm writing Unicode characters to a Log4J2 log. On one machine (Windows 8) I see this in the log:
2016-08-30 16:44:00.958|English: The quick brown fox jumped over the lazy dog.
2016-08-30 16:44:00.960|German: Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.
2016-08-30 16:44:00.960|Russian 1: В чащах юга жил бы цитрус? Да, но фальшивый экземпляр!
2016-08-30 16:44:00.960|Russian 2: Съешь же ещё этих мягких французских булок да выпей чаю.
2016-08-30 16:44:00.960|Chinese: 中国智造,慧及全球
2016-08-30 16:44:00.960|Japanese: いろはにほへと ちりぬるを わかよたれそ つねならむ うゐのおくやま けふこえて あさきゆめみし ゑひもせす
2016-08-30 16:44:00.960|Korean: 다람쥐 헌 쳇바퀴에 타고파
On another machine (Windows Server 2012R2) I see this:
2016-08-30 16:50:41.676|English: The quick brown fox jumped over the lazy dog.
2016-08-30 16:50:41.676|German: Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.
2016-08-30 16:50:41.676|Russian 1: ? ????? ??? ??? ?? ??????? ??, ?? ????????? ?????????!
2016-08-30 16:50:41.676|Russian 2: ????? ?? ??? ???? ?????? ??????????? ????? ?? ????? ???.
2016-08-30 16:50:41.676|Chinese: ?????????
2016-08-30 16:50:41.676|Japanese: ??????? ????? ?????? ????? ??????? ????? ??????? ?????
2016-08-30 16:50:41.676|Korean: ??? ? ???? ???
If Log4J2 writes UTF-8 by default, why does the log file on the second system contain only question marks? The second system may be (and probably is) missing fonts, but the log file itself contains literal question marks. Looking at it with a hexdump tool, I would expect to see at least the UTF-8 byte sequences for those characters. Put another way, I can understand why an unknown character might render incorrectly; I just don't understand why the correct Unicode was not written to the file, given that the process doing the writing is the JVM, which uses Unicode for characters.
Did you try to enforce the UTF-8 charset for your Log4j Layout, inside your Log4j configuration file? For example, using PatternLayout:
<Configuration ...>
...
<PatternLayout pattern="..." charset="UTF-8"/>
...
</Configuration>
See https://logging.apache.org/log4j/2.x/manual/layouts.html for more information on Log4j encoding issues.
The default charset for the pattern layout in Log4j 2 is the system default charset, not UTF-8. Other layouts may have a different default charset; this is documented in the manual page for each layout.
As indicated in the other answer, you can specify the charset in the layout's configuration.
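To see why literal question marks end up in the file, here is a small sketch in plain Java (no Log4j involved). It assumes the second machine's default charset is a single-byte Western one; ISO-8859-1 stands in for it here, which would be consistent with the German line surviving while the Russian and CJK lines do not. The class name `CharsetDemo` and the helper `toHex` are made up for illustration. When a `String` is encoded with a charset that cannot represent a character, Java substitutes that charset's replacement byte, which for ISO-8859-1 is `?` (0x3F), so the original code points are lost before anything reaches the disk.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Encodes the text with the given charset and returns the bytes as hex.
    public static String toHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+00DC (German), U+044F (Russian), U+4E2D (Chinese), written as
        // escapes so the source file's own encoding does not matter.
        String sample = "\u00DC\u044F\u4E2D";

        // The charset a layout falls back to when none is configured.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Unmappable characters become 0x3F ('?'): prints DC 3F 3F
        System.out.println(toHex(sample, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent all three: prints C3 9C D1 8F E4 B8 AD
        System.out.println(toHex(sample, StandardCharsets.UTF_8));
    }
}
```

Running this on both machines and comparing the "Default charset" line should show whether the two systems really differ in their default encoding.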