A very similar question was posted five years ago on CodeRanch, but it never got an answer: http://www.coderanch.com/t/463055/java-io/java/handle-charset-ProcessBuilder
Here is my code:
new ProcessBuilder("/bin/bash", "-c", "echo тест | hexdump").start();
Here is what I'm expecting:
0000000 d1 82 d0 b5 d1 81 d1 82
This is what I'm getting on another Linux machine:
0000000 f2 e5 f1 f2
What is going on here, and how do I fix it?
ProcessBuilder encodes the command arguments with the JVM's default charset, which is taken from the file.encoding system property.
For example, take the following source:
import java.io.*;

class Main {
    public static void main(String[] args) throws Exception {
        ProcessBuilder pb = new ProcessBuilder("/bin/bash", "-c", "echo -n тест | hexdump -C");
        Process p = pb.start();
        InputStream is = p.getInputStream();
        printInputStream(is);
        p.waitFor();
    }

    public static void printInputStream(InputStream is) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(is));
        try {
            for (;;) {
                String line = br.readLine();
                if (line == null) break;
                System.out.println(line);
            }
        } finally {
            br.close();
        }
    }
}
You will get f2 e5 f1 f2 if an encoding such as Windows-1251 is the default encoding in your runtime environment (or if you have explicitly set file.encoding to Windows-1251).
$ java -Dfile.encoding=Windows-1251 Main
00000000 f2 e5 f1 f2 |....|
00000004
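To see that both outputs are the same string in two different encodings, here is a minimal sketch that encodes the text directly in Java, with no child process involved. The class name EncodingDemo and the helper encodeHex are made up for illustration. The string is written with Unicode escapes so that the result does not depend on the encoding javac uses to read the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Hypothetical helper: encode text with the given charset and
    // return the resulting bytes as space-separated lowercase hex.
    static String encodeHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "тест" spelled with Unicode escapes (U+0442 U+0435 U+0441 U+0442)
        String text = "\u0442\u0435\u0441\u0442";

        // d1 82 d0 b5 d1 81 d1 82
        System.out.println(encodeHex(text, StandardCharsets.UTF_8));

        // f2 e5 f1 f2
        System.out.println(encodeHex(text, Charset.forName("windows-1251")));

        // Whatever this JVM currently uses as its default charset
        System.out.println(Charset.defaultCharset() + ": "
                + encodeHex(text, Charset.defaultCharset()));
    }
}
```

The last line printed shows which of the two cases applies on a given machine.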
If you want d1 82 d0 b5 d1 81 d1 82 as the result, set file.encoding to UTF-8 explicitly.
$ java -Dfile.encoding=UTF-8 Main
00000000 d1 82 d0 b5 d1 81 d1 82 |........|
00000008
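If you cannot control file.encoding, one possible workaround (a different technique from the one above, not something ProcessBuilder does for you) is to keep the non-ASCII text out of the command line and write it to the child's standard input as bytes you encode yourself. The sketch below assumes a Unix-like system with cat on the PATH and Java 9+ for readAllBytes; StdinPipeDemo and pipeThrough are hypothetical names.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StdinPipeDemo {
    // Hypothetical helper: start the command, send the text to its stdin
    // as UTF-8 bytes, and return everything it wrote to stdout.
    // Only suitable for small inputs, since all input is written before reading.
    static byte[] pipeThrough(String text, String... command) {
        try {
            Process p = new ProcessBuilder(command).start();
            try (OutputStream stdin = p.getOutputStream()) {
                // Explicit charset: independent of file.encoding
                stdin.write(text.getBytes(StandardCharsets.UTF_8));
            }
            byte[] out = p.getInputStream().readAllBytes();
            p.waitFor();
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "тест" as Unicode escapes; cat simply echoes the bytes it receives
        byte[] out = pipeThrough("\u0442\u0435\u0441\u0442", "cat");
        StringBuilder hex = new StringBuilder();
        for (byte b : out) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Replacing "cat" with "hexdump", "-C" would dump the same bytes in the format used in the question, assuming hexdump is installed.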