Python subprocess introduces spaces

I have a Python script that uses subprocess.Popen to execute Windows *.exe files. All of the EXEs except one produce the expected output. When printed with print(), the output of that one EXE has whitespace between every character.

This is how the output looks when executing the EXE in Windows command line:

C:\Python27>autorunsc.exe /accepteula

Sysinternals Autoruns v13.51 - Autostart program viewer
Copyright (C) 2002-2015 Mark Russinovich
Sysinternals - www.sysinternals.com


HKLM\System\CurrentControlSet\Control\Terminal Server\Wds\rdpwd\StartupPrograms
   rdpclip
     rdpclip
     RDP Clip Monitor
     Microsoft Corporation
     6.1.7601.17514
     c:\windows\system32\rdpclip.exe
     20/11/2010 11:22

HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Userinit
   C:\Windows\system32\userinit.exe

This is how it looks when printed in Python:

Sysinternals Autoruns v13.51 - Autostart program viewer
Copyright (C) 2002-2015 Mark Russinovich
Sysinternals - www.sysinternals.com


 H K L M \ S y s t e m \ C u r r e n t C o n t r o l S e t \ C o n t r o l \
 r m i n a l   S e r v e r \ W d s \ r d p w d \ S t a r t u p P r o g r a m
       r d p c l i p
           r d p c l i p
           R D P   C l i p   M o n i t o r
           M i c r o s o f t   C o r p o r a t i o n
           6 . 1 . 7 6 0 1 . 1 7 5 1 4
           c : \ w i n d o w s \ s y s t e m 3 2 \ r d p c l i p . e x e
           2 0 / 1 1 / 2 0 1 0   1 1 : 2 2

 H K L M \ S O F T W A R E \ M i c r o s o f t \ W i n d o w s   N T \ C u r
 n t V e r s i o n \ W i n l o g o n \ U s e r i n i t

The whitespace is clearly visible. Interestingly, the first few lines do not contain the extra spaces.

This is the code:

import subprocess

p = subprocess.Popen('autorunsc.exe /accepteula', stderr=subprocess.STDOUT,
                     stdout=subprocess.PIPE, shell=True)
a = p.stdout.read()
print(a)

Where do the spaces come from, and how do I remove them?

asked Jan 11 '16 20:01 by user3138929


1 Answer

Some Windows tools, apparently including this one, write their output encoded in UTF-16.
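To see why UTF-16 output looks like it has a space after every character, here is a minimal sketch. It assumes the little-endian variant (utf-16-le), which is an assumption about this tool rather than something stated in the question. Each ASCII character is stored as its own byte followed by a 0x00 byte, and a console typically renders that 0x00 as a blank.

```python
# A short sample taken from the registry path in the question.
text = u"HKLM"

# In UTF-16-LE every ASCII character becomes two bytes: the character, then 0x00.
raw = text.encode("utf-16-le")
print(repr(raw))

# Half of the bytes are NUL (0x00); these are the apparent "spaces".
nul_count = raw.count(b"\x00")

# Decoding with the matching codec restores the original text.
decoded = raw.decode("utf-16-le")
print(decoded)
```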

You have to decode the output with the correct encoding using the str.decode method. Quoting the docs:

str.decode([encoding[, errors]])

Decodes the string using the codec registered for encoding. encoding defaults to the default string encoding. errors may be given to set a different error handling scheme. The default is 'strict', meaning that encoding errors raise UnicodeError. Other possible values are 'ignore', 'replace' and any other name registered via codecs.register_error(), see section Codec Base Classes.

a = p.stdout.read().decode('utf-16')

For a table of standard encodings, refer to section 7.8.3, Standard Encodings, in the Python documentation.
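Putting the pieces together, a hedged sketch of running a command and decoding everything it prints might look like the following. The helper name run_and_decode and its encoding parameter are hypothetical, introduced only for illustration. It passes the command as a list instead of using shell=True, and it uses errors='replace' so that a stray byte does not raise UnicodeError. Note that 'utf-16' relies on a byte order mark or the platform's native byte order; 'utf-16-le' may be the safer choice for Windows tools, but that is an assumption.

```python
import subprocess


def run_and_decode(cmd, encoding="utf-16"):
    """Run cmd (a list of arguments) and return its combined output as text.

    Hypothetical helper: assumes the whole output uses a single encoding.
    """
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() reads all output and waits for the process to exit.
    raw, _ = p.communicate()
    # Undecodable bytes become U+FFFD instead of raising UnicodeError.
    return raw.decode(encoding, "replace")
```

For the command in the question, the call would be run_and_decode(['autorunsc.exe', '/accepteula']).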

Since your output seems to have mixed encodings (the "spaces", which are really 0x00 characters rather than 0x20, appear in only part of the output), you may want to preprocess or partition the string before decoding it.
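One possible way to partition the output is sketched below. It rests on several assumptions that are not confirmed by the question: the banner is plain single-byte ASCII with no 0x00 bytes, everything after it is UTF-16 little-endian, and the first UTF-16 character is in the ASCII range, so the first 0x00 byte marks the second byte of that character. The function name decode_mixed is hypothetical.

```python
import codecs


def decode_mixed(raw):
    """Decode bytes made of an ASCII prefix followed by UTF-16-LE text.

    Heuristic sketch: the UTF-16 part is assumed to start one byte
    before the first NUL (0x00) byte.
    """
    first_nul = raw.find(b"\x00")
    if first_nul < 1:
        # No NUL found (or one at position 0, which this sketch does not
        # handle): treat the whole output as single-byte text.
        return raw.decode("ascii", "replace")

    start = first_nul - 1
    head_bytes, tail_bytes = raw[:start], raw[start:]

    # Drop a UTF-16-LE byte order mark if one sits right at the boundary.
    if head_bytes.endswith(codecs.BOM_UTF16_LE):
        head_bytes = head_bytes[:-len(codecs.BOM_UTF16_LE)]

    head = head_bytes.decode("ascii", "replace")
    tail = tail_bytes.decode("utf-16-le", "replace")
    return head + tail
```

With the code from the question, this would be used as print(decode_mixed(p.stdout.read())). If the tool's output does not match the assumptions above, the split point will be wrong, so it is worth inspecting repr() of the raw bytes first.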

answered Oct 13 '22 09:10 by Łukasz Rogalski