I have a Hive query containing CJK characters, stored in a file like this:
SELECT * FROM tbl WHERE name LIKE '日本語%';
And the file is encoded in UTF-8:
> file -bi query.hql
text/plain; charset=utf-8
If I execute it with the Hive CLI, I get the expected result:
> /path/to/hive -f query.hql
some results here
Now I want to execute this query from Java, so I wrote code like this:
String[] cmd = new String[]{"/bin/bash", "/my/script", "/path/to/query.hql", "/path/to/output.txt"};
ProcessBuilder pb = new ProcessBuilder(cmd);
...
pb.start();
...
And /my/script looks like:
HQL_FILE=$1
OUTPUT_FILE=$2
/path/to/hive -f "${HQL_FILE}" > "${OUTPUT_FILE}"
I ran my Java program but got no output. I checked the Hive log file, and it looks like an encoding issue.
If I run hive -f query.hql from the shell, the CJK text is logged correctly in the Hive log:
> cat /tmp/myuser/hive.log
2016-02-29 11:27:40,303 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: ... name LIKE '日本語%' ...
But if I run it via the Java program above, the CJK text in the log is replaced with question marks:
> cat /tmp/myuser/hive.log
2016-02-29 11:29:41,104 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: ... name LIKE '???????%' ...
I've been investigating this problem for half a day but could not find any useful information.
I would appreciate any advice.
PS:
Assuming that the Java program isn't writing the hql file itself, in the shell where the hive command works, run this command:
echo $LANG
You'll probably get something like en_US.UTF-8.
Take whatever value you get and modify your Java program to have this after you create the ProcessBuilder:
pb.environment().put("LANG", "en_US.UTF-8");
(Use whatever value you got instead of en_US.UTF-8)
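Put together, the launcher might look like the sketch below. It reuses the script and file paths from the question. The class name HiveLauncher and the helper build are made up for illustration, and en_US.UTF-8 is only a placeholder for whatever echo $LANG printed in your shell.

```java
import java.io.IOException;
import java.util.Map;

public class HiveLauncher {

    // Hypothetical helper: prepares the wrapper-script invocation with an explicit locale.
    public static ProcessBuilder build(String script, String hqlFile, String outputFile, String lang) {
        ProcessBuilder pb = new ProcessBuilder("/bin/bash", script, hqlFile, outputFile);
        // environment() returns a modifiable copy of this JVM's environment;
        // changes here affect only processes started from this builder.
        Map<String, String> env = pb.environment();
        env.put("LANG", lang);
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // "en_US.UTF-8" is an assumed value; substitute the output of `echo $LANG`.
        ProcessBuilder pb = build("/my/script", "/path/to/query.hql", "/path/to/output.txt", "en_US.UTF-8");
        pb.inheritIO(); // show the script's stdout/stderr in this program's console
        int exitCode = pb.start().waitFor();
        System.out.println("exit code: " + exitCode);
    }
}
```

If the question marks persist, check whether LC_ALL or LC_CTYPE is set in the Java program's environment, since those take precedence over LANG.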
If your Java program is writing the hql file itself, then there's something else to worry about too: when you open the file, you should specify UTF-8 encoding for output. How to do that will depend a bit on how you're opening the file.
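For example, if the file is written in one go, a minimal sketch using java.nio could look like this. The class name HqlWriter and the method writeQuery are hypothetical. The CJK literal is written with Unicode escapes so that the example does not depend on the encoding of the Java source file itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HqlWriter {

    // Hypothetical helper: writes the query text as UTF-8 bytes,
    // regardless of the platform default charset.
    public static void writeQuery(Path file, String query) throws IOException {
        Files.write(file, query.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // \u65E5\u672C\u8A9E is the string 日本語 from the question.
        String query = "SELECT * FROM tbl WHERE name LIKE '\u65E5\u672C\u8A9E%';\n";
        writeQuery(Paths.get("query.hql"), query);
    }
}
```

If you write the file through a stream instead, the equivalent is to wrap it as new OutputStreamWriter(outputStream, StandardCharsets.UTF_8) rather than relying on FileWriter's default charset.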