How can I get subprocess.check_call to give me the raw binary output of a command? It seems to be encoding the output incorrectly somewhere.
Details:
I have a command that returns text like this:
some output text “quote” ...
(Those are Unicode curly quotes; the closing quote is the UTF-8 byte sequence e2 80 9d.)
Here's how I'm calling the command:
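import subprocess
from tempfile import SpooledTemporaryFile
f_output = SpooledTemporaryFile()
subprocess.check_call(cmd, shell=True, stdout=f_output)
f_output.seek(0)
output = f_output.read()  # whatever bytes the command wrote to stdout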
The problem is I get this:
>>> repr(output)
some output text ?quote? ...
>>> type(output)
<str>
(And if I call 'ord' on the '?' I get 63.) I'm on Python 2.7 on Linux.
Note: Running the same code on OSX works correctly for me. The problem only appears when I run it on a Linux server.
Wow, this was the weirdest issue ever but I've fixed it!
It turns out that the program being called (a Java program) was using a different output encoding depending on where it was called from!
On my dev OSX machine it returns the characters fine. On the Linux server, run from the command line, it also returns them fine. Called from a Django app, nope: they turn into "?"s.
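A quick way to rule Python out is to check that subprocess hands back exactly the bytes the child wrote. The following is only a minimal sketch: CHILD_CODE and capture_raw are names made up for this illustration, and it uses check_output with a small Python child process instead of the temporary file and Java command from the question.

```python
import subprocess
import sys

# Child process that writes the three raw bytes of a UTF-8 closing curly quote.
# getattr(...) picks the binary stream on Python 3 and plain stdout on Python 2.
CHILD_CODE = (
    "import sys; "
    "out = getattr(sys.stdout, 'buffer', sys.stdout); "
    "out.write(b'\\xe2\\x80\\x9d')"
)


def capture_raw(args):
    """Return the child's stdout as raw bytes, with no decoding step."""
    return subprocess.check_output(args)


raw = capture_raw([sys.executable, "-c", CHILD_CODE])
print(repr(raw))
```

If the captured bytes match what the child wrote, any "?" in the real output was most likely already produced by the called program before Python ever saw it.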
To fix this I ended up adding this argument to the command:
-Dfile.encoding=utf-8
I got that idea here, and it seems to work. There's also a way to modify the Java program internally to do that.
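As a rough sketch of how the flag could be added from the Python side, the helper below inserts it right after the java executable, since JVM options have to come before -jar or the main class. The names with_file_encoding and run_java, and the jar and file names, are hypothetical; this assumes the command is built as an argument list rather than the shell string used in the question.

```python
import subprocess


def with_file_encoding(java_args, encoding="utf-8"):
    """Return a copy of java_args with -Dfile.encoding added after the executable.

    Assumes java_args[0] is the java executable. JVM options must precede
    -jar or the main class, so the flag goes in position 1.
    """
    flag = "-Dfile.encoding=" + encoding
    # Leave the command alone if an encoding flag is already present.
    if any(arg.startswith("-Dfile.encoding=") for arg in java_args[1:]):
        return list(java_args)
    return [java_args[0], flag] + list(java_args[1:])


def run_java(java_args, encoding="utf-8"):
    """Run the command with the flag added and return stdout as raw bytes."""
    return subprocess.check_output(with_file_encoding(java_args, encoding))


# "tool.jar", "--input" and "data.txt" are placeholders, not the real command.
cmd = with_file_encoding(["java", "-jar", "tool.jar", "--input", "data.txt"])
print(cmd)
```

Passing a list avoids shell=True and makes it harder to put the flag in the wrong position by accident.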
Sorry I blamed Python! You guys had the right idea.