We are trying to use Java and UTF-8 on Windows. The application writes its logs to the console, and we would like those logs to be in UTF-8 because they are internationalized.
It is possible to configure the JVM so that it generates UTF-8, by passing -Dfile.encoding=UTF-8 as an argument to the JVM. It works fine, but the output on a Windows console is garbled.
We can then set the code page of the console to 65001 (chcp 65001), but in that case the .bat files do not work: when we try to launch our application through our script (named start.bat), absolutely nothing happens. The command simply returns:
C:\Application> chcp 65001
Activated code page: 65001
C:\Application> start.bat
C:\Application>
But without chcp 65001, there is no problem, and the application can be launched.
Any hints about that?
On Windows, the native encoding cannot be UTF-8, nor any other encoding that can represent all Unicode characters. Windows sometimes replaces characters with similar-looking representable ones ("best fit"), which often works well but sometimes has surprising results, e.g. the Greek letter alpha becomes the letter a.
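To see whether a given log message can be represented in a legacy single-byte encoding at all, something like the following sketch may help. It uses ISO-8859-1 purely as a stand-in for a legacy code page (an assumption, it is not the actual Windows console code page), and the class and method names are made up for illustration. Note that Java itself does not do Windows-style best-fit substitution here: unmappable characters are replaced with `?`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RepresentableCheck {
    // Returns true if every character of the text can be encoded in the given charset.
    public static boolean representable(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        Charset legacy = StandardCharsets.ISO_8859_1; // stand-in for a legacy code page

        System.out.println(representable("caf\u00e9", legacy)); // é exists in ISO-8859-1
        System.out.println(representable("\u03b1", legacy));    // Greek alpha does not

        // String.getBytes substitutes the charset's replacement byte ('?') for
        // characters it cannot map, so the alpha is silently lost.
        byte[] lossy = "\u03b1".getBytes(legacy);
        System.out.println((char) lossy[0]);
    }
}
```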
Windows Terminal includes multiple tabs, panes, customizable shortcuts, support for Unicode and UTF-8 characters, and custom themes and styles. The terminal can support PowerShell, cmd, WSL, and other command-line tools.
The native character encoding of the Java programming language is UTF-16. A charset in the Java platform therefore defines a mapping between sequences of sixteen-bit UTF-16 code units (that is, sequences of chars) and sequences of bytes.
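As a small illustration of that mapping, the sketch below encodes the same one-char string with three charsets and prints the resulting bytes. The class name and helper methods are hypothetical, only `String.getBytes(Charset)` and `StandardCharsets` come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMappingDemo {
    // Encode a Java String (UTF-16 code units internally) into bytes with the given charset.
    public static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Render bytes as space-separated uppercase hex, e.g. "C3 A9".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // 'é', a single UTF-16 code unit
        System.out.println(hex(encode(text, StandardCharsets.UTF_8)));      // C3 A9
        System.out.println(hex(encode(text, StandardCharsets.ISO_8859_1))); // E9
        System.out.println(hex(encode(text, StandardCharsets.UTF_16BE)));   // 00 E9
    }
}
```

The same string yields different byte sequences depending on the charset, which is why the console has to decode with the same encoding the JVM used to write.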
Try chcp 65001 && start.bat
The chcp command changes the code page, and 65001 is the Win32 code page identifier for UTF-8 under Windows 7 and up. A code page, or character encoding, specifies how to convert a Unicode code point to a sequence of bytes or back again.
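On the Java side, one possible way to make sure the bytes sent to the console really are UTF-8, independent of -Dfile.encoding, is to wrap System.out in a PrintStream with an explicit encoding. This is only a sketch, not something from the original question: the class and method names are hypothetical, and it assumes the console has already been switched to code page 65001.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8ConsoleDemo {
    // Wrap any output stream in an auto-flushing PrintStream that encodes text as UTF-8.
    public static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Should not happen: every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Replace the default System.out so later println calls emit UTF-8 bytes.
        System.setOut(utf8Stream(System.out));
        System.out.println("caf\u00e9 \u03b1");
    }
}
```

Whether the characters then display correctly still depends on the console's code page and font, so this complements chcp 65001 rather than replacing it.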