
Why does Log4J2 output differ on two systems when I am writing the same UTF-8?

I'm writing Unicode characters to a Log4J2 log. On one machine (Windows 8) I see this in the log:

2016-08-30 16:44:00.958|English:  The quick brown fox jumped over the lazy dog.
2016-08-30 16:44:00.960|German:  Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.
2016-08-30 16:44:00.960|Russian 1:  В чащах юга жил бы цитрус? Да, но фальшивый экземпляр!
2016-08-30 16:44:00.960|Russian 2:  Съешь же ещё этих мягких французских булок да выпей чаю.
2016-08-30 16:44:00.960|Chinese:  中国智造,慧及全球
2016-08-30 16:44:00.960|Japanese:  いろはにほへと ちりぬるを わかよたれそ つねならむ うゐのおくやま けふこえて あさきゆめみし ゑひもせす
2016-08-30 16:44:00.960|Korean:  다람쥐 헌 쳇바퀴에 타고파

On another machine (Windows Server 2012 R2) I see this:

2016-08-30 16:50:41.676|English:  The quick brown fox jumped over the lazy dog.
2016-08-30 16:50:41.676|German:  Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.
2016-08-30 16:50:41.676|Russian 1:  ? ????? ??? ??? ?? ??????? ??, ?? ????????? ?????????!
2016-08-30 16:50:41.676|Russian 2:  ????? ?? ??? ???? ?????? ??????????? ????? ?? ????? ???.
2016-08-30 16:50:41.676|Chinese:  ?????????
2016-08-30 16:50:41.676|Japanese:  ??????? ????? ?????? ????? ??????? ????? ??????? ?????
2016-08-30 16:50:41.676|Korean:  ??? ? ???? ???

If Log4J2 writes UTF-8 by default, why does the log file on the second system contain only question marks? The second system may be (and probably is) missing fonts, but the log file itself contains literal question marks: using a hexdump tool, I would expect to see at least the UTF-8 bytes for those characters in the file. Put another way, I can understand why an unknown character might render incorrectly; I just don't understand why the correct Unicode was not written to the file, given that the process doing the writing is the JVM, which uses Unicode for characters.

gknauth asked Aug 31 '16 18:08


2 Answers

Did you try to enforce the UTF-8 charset for your Log4j Layout, inside your Log4j configuration file? For example, using PatternLayout:

<Configuration ...>
    ...
    <PatternLayout pattern="..." charset="UTF-8"/>
    ...
</Configuration>

See https://logging.apache.org/log4j/2.x/manual/layouts.html for more information on Log4j encoding issues.

xav answered Sep 28 '22 01:09


The default charset for the pattern layout in Log4j 2 is the system default charset, not UTF-8. Other layouts may have a different default charset; this is documented in the manual page for each layout.

As indicated in the other answer, you can specify the charset in the layout's configuration.
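
To see why literal question marks end up in the file, here is a minimal Java sketch of what happens when text is encoded with a charset that cannot represent some characters. It assumes the second machine's default charset is a single-byte Western one; ISO-8859-1 is used only as a stand-in because every JVM is guaranteed to ship it. The class and method names (CharsetDemo, roundTrip) are made up for illustration and are not part of Log4j.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Encode the text with the given charset, then decode the bytes with the
    // same charset. This mimics reading back a file written in that charset.
    static String roundTrip(String text, Charset charset) {
        // String.getBytes(Charset) replaces every unmappable character with
        // the charset's replacement byte, which is '?' for ISO-8859-1.
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Compare this value on both machines; it is what a layout falls
        // back to when no charset is configured.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Unicode escapes keep the sample independent of source file encoding.
        String german = "gr\u00f6\u00dferen";   // a word from the German line
        String russian = "\u0447\u0430\u044e";  // a word from the Russian line

        // German survives: its letters exist in ISO-8859-1.
        System.out.println(roundTrip(german, StandardCharsets.ISO_8859_1));
        // Russian is lost at encoding time: the bytes written are already '?'.
        System.out.println(roundTrip(russian, StandardCharsets.ISO_8859_1));
        // With UTF-8 the same text round-trips unchanged.
        System.out.println(roundTrip(russian, StandardCharsets.UTF_8));
    }
}
```

This would be consistent with the logs in the question: the German line is intact on both machines, while the Russian, Chinese, Japanese and Korean lines are replaced when the bytes are produced, so no font or viewer can recover them afterwards.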

Remko Popma answered Sep 28 '22 03:09