I have two Python scripts: object_generator.py, which pickles a given object and prints it, and object_consumer.py, which captures the output of the first script through subprocess.communicate and tries to unpickle it using pickle.loads. I am having trouble making this simple scenario work. This is my code:
# object_generator.py
import pickle
import base64
o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}
d = pickle.dumps(o)
print(d)
#Various Approaches I had tried, none of which worked. Ignore this part.
#s = base64.b64decode(d)
#encoded_str = str(d).encode('ascii')
#print('encoded str is :')
#print(encoded_str)
#decoded_str = encoded_str.decode('ascii')
#print('decoded str is :')
#print(decoded_str)
#unpickled_obj = pickle.loads(bytes(decoded_str))
#print(unpickled_obj)
#print(type(d))
#print(codecs.decode(d))
# object_consumer.py
import pickle
import subprocess
import os
dr = '"' + os.path.dirname(os.path.abspath(__file__)) + '\\object_generator.py"'
cmd = 'python -u ' + dr
proc = subprocess.Popen(cmd,stdout=subprocess.PIPE)
try:
    outs, errs = proc.communicate(timeout=15)
except subprocess.TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
# 'outs' at this point is something like this :
# b"b'\\x80\\x03}q\......x05K\\x03u.'\r\n"
# DO SOMETHING WITH outs to get back the bytes which can then be
# unpickled using pickle.loads
obj = pickle.loads(outs)
print(obj)
Clearly, I need to strip off the trailing \r\n, which is easy, but what should be done next?
There are a couple of issues going on here. First, you're printing a bytes object in object_generator.py. In Python 3.x, that results in str(obj) being called, which means b'yourbyteshere' gets printed. You don't want the leading b' or trailing '. To fix that, you need to decode the bytes object to a str. The 'latin-1' encoding maps every byte value from 0 to 255 to exactly one character, so decoding with it never fails, and encoding the result with 'latin-1' again gives back the original bytes. The other issue is that the encoding Windows uses by default for sys.stdout doesn't actually support printing decoded pickle strings. So, we need to change the encoding for sys.stdout* to 'latin-1', so the string will make it to the parent process with the correct encoding.
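import io
import pickle
import sys

o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}
d = pickle.dumps(o)
# Re-wrap stdout so text is encoded as latin-1.
# newline='' keeps any '\n' in the data from being rewritten as '\r\n' on Windows.
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding='latin-1', newline='')
print(d.decode('latin-1'), end='', flush=True) # end='' will remove that extra \r\n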
Make those changes, and it should work fine.
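To see why both changes matter, here is a small self-contained check. It is only a sketch: the variable names printed, text, round_tripped and restored are mine, not from the question. It shows that printing a bytes object emits its b'...' repr, and that a 'latin-1' decode/encode round trip gives back exactly the pickled bytes.

```python
import pickle

o = {'first': 1, 'second': 2, 'third': 3, 'ls': [1, 2, 3]}
d = pickle.dumps(o)

# print(d) writes the repr of the bytes object: the text b'...' rather than the raw bytes.
printed = repr(d)
looks_like_repr = printed.startswith('b')

# latin-1 assigns one character to each byte value, so this round trip is lossless.
text = d.decode('latin-1')
round_tripped = text.encode('latin-1')

# The recovered bytes unpickle to the original object.
restored = pickle.loads(round_tripped)
print(looks_like_repr, round_tripped == d, restored == o)  # prints: True True True
```

The parent process never has to do the encode step itself: once the child's stdout is latin-1, the bytes that arrive through the pipe are already the pickle data.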
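Edit:
Another option, instead of re-wrapping sys.stdout in the child, would be to set the PYTHONIOENCODING environment variable to 'latin-1' from the parent process. The child still prints d.decode('latin-1') with end='':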
env = os.environ.copy()
env['PYTHONIOENCODING'] = 'latin-1'
proc = subprocess.Popen(['python3', 'object_generator.py'], stdout=subprocess.PIPE, env=env)
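Putting the environment-variable approach together, the parent side might look roughly like the sketch below. To keep it self-contained, the child is passed inline through -c as CHILD_CODE (a hypothetical stand-in for object_generator.py), and sys.executable is used instead of a hard-coded interpreter name; both are my assumptions, not part of the original scripts.

```python
import os
import pickle
import subprocess
import sys

# Hypothetical stand-in for object_generator.py, passed inline so the sketch runs on its own.
CHILD_CODE = (
    "import pickle\n"
    "o = {'first': 1, 'second': 2, 'third': 3, 'ls': [1, 2, 3]}\n"
    "print(pickle.dumps(o).decode('latin-1'), end='', flush=True)\n"
)

env = os.environ.copy()
env['PYTHONIOENCODING'] = 'latin-1'  # the child's stdout now encodes text as latin-1

proc = subprocess.Popen([sys.executable, '-c', CHILD_CODE],
                        stdout=subprocess.PIPE, env=env)
try:
    outs, errs = proc.communicate(timeout=15)
except subprocess.TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()

# outs holds the raw pickle bytes, with no b'...' wrapper and no trailing newline.
obj = pickle.loads(outs)
print(obj)
```

One caveat: on Windows, a text-mode stdout also rewrites '\n' as '\r\n', so a pickle that happens to contain a 0x0A byte would need extra care with this variant.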
* See this question for more info on changing the sys.stdout encoding in Python 3. Both approaches I show here are mentioned there.
I don't recommend using pickle between your main file and an unknown external one, since it requires the original classes to be available when unpickling, and it's also slow.
I used the marshal module instead; I hope this saves you some time: https://github.com/jstar88/pyCommunicator
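For reference, a minimal sketch of the standard-library marshal module on the same dictionary. This calls marshal directly and does not show the pyCommunicator API, which I am not reproducing here; the names data and restored are mine.

```python
import marshal

o = {'first': 1, 'second': 2, 'third': 3, 'ls': [1, 2, 3]}

# marshal.dumps returns bytes, much like pickle.dumps.
data = marshal.dumps(o)

# marshal.loads rebuilds the object; no class definitions are involved.
restored = marshal.loads(data)
print(restored == o)  # prints: True
```

Keep in mind that marshal only handles built-in types, and its format is not guaranteed to be compatible across Python versions. Its output is still bytes, so the same care is needed when sending it through stdout.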