On Linux, the command ps aux prints a list of processes, with one column per statistic, e.g.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
postfix 22611 0.0 0.2 54136 2544 ? S 15:26 0:00 pickup -l -t fifo -u
apache 22920 0.0 1.5 198340 16588 ? S 09:58 0:05 /usr/sbin/httpd
I want to be able to read this in using Python and split out each row and then each column so they can be used as values.
For the most part, this is not a problem:
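import subprocess
# universal_newlines=True makes communicate() return text rather than bytes
ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE, universal_newlines=True).communicate()[0]
processes = ps.split('\n')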
I can now loop through processes to get each row and split it out by spaces, for example
import re
sep = re.compile(r'\s+')
for row in processes:
    print(sep.split(row))
However, the problem is that the last column, the command, sometimes contains spaces. In the example above, this can be seen in the command
pickup -l -t fifo -u
which would be split out as
['postfix', '22611', '0.0', '0.2', '54136', '2544', '?', 'S', '15:26', '0:00', 'pickup', '-l', '-t', 'fifo', '-u']
but I really want it as:
['postfix', '22611', '0.0', '0.2', '54136', '2544', '?', 'S', '15:26', '0:00', 'pickup -l -t fifo -u']
So my question is, how can I split out the columns but when it comes to the command column, keep the whole string as one list element rather than split out by spaces?
Use the second parameter of split, maxsplit, which specifies the maximum number of splits to perform; everything after the last split stays together as one element. You can find the number by counting the fields in the first line, i.e. the column titles.
import subprocess
ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE, universal_newlines=True).communicate()[0]
processes = ps.split('\n')
# this specifies the number of splits, so the split lines
# will have (nfields+1) elements
nfields = len(processes[0].split()) - 1
for row in processes[1:]:
    print(row.split(None, nfields))
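To make the maxsplit idea easier to try without running ps, here is a minimal sketch that applies the same technique to the two sample rows from the question. The names SAMPLE and parse_ps are made up for this illustration, and pairing each row with the header names via zip is an extra convenience, not part of the answer above.

```python
# Sample text copied from the question; a real run would use the output of `ps aux`.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postfix 22611 0.0 0.2 54136 2544 ? S 15:26 0:00 pickup -l -t fifo -u
apache 22920 0.0 1.5 198340 16588 ? S 09:58 0:05 /usr/sbin/httpd
"""


def parse_ps(text):
    """Hypothetical helper: turn ps-style text into a list of dicts keyed by column title."""
    lines = text.splitlines()
    headers = lines[0].split()
    # One split fewer than the number of columns, so the last column
    # (COMMAND) keeps its internal spaces.
    max_splits = len(headers) - 1
    rows = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines such as a trailing newline
        fields = line.split(None, max_splits)
        rows.append(dict(zip(headers, fields)))
    return rows


rows = parse_ps(SAMPLE)
print(rows[0]['COMMAND'])  # pickup -l -t fifo -u
```

All values come back as strings, so convert columns such as PID or RSS with int() if you need numbers.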
Check out the psutil package.
psutil.process_iter() returns a generator which you can use to iterate over all processes.
p.cmdline() returns a list of each Process object's command-line arguments, separated just the way you want. (In current psutil releases cmdline and exe are methods; very old releases exposed them as attributes.)
You can create a dictionary mapping each pid to (pid, cmdline, path) with just one line and then use it any way you want. Note that exe() may raise psutil.AccessDenied for processes you do not own.
import psutil
pid_dict = dict((p.pid, {'pid': p.pid, 'cmdline': p.cmdline(), 'path': p.exe()})
                for p in psutil.process_iter())
Why don't you use PSI instead? PSI provides process information on Linux and other Unix variants.
import psi.process
for p in psi.process.ProcessTable().values(): …