
Efficiently retrieving stats for all running processes using `psutil`

I am building a utility that retrieves information about all running processes on the OS (CentOS 7) using Python 3.6.5.

I wrote the following function for this, using psutil:

import psutil

def get_processes(self):
    fqdn = self.get_FQDN()
    process_infos = []
    for proc in psutil.process_iter():
        proc_info = {"computer": fqdn}
        try:
            # oneshot() caches the data shared by the calls below
            with proc.oneshot():
                proc_info["pid"] = proc.pid
                proc_info["ppid"] = proc.ppid()
                proc_info["name"] = proc.name()
                try:
                    proc_info["exe"] = proc.exe()  # Requires root access for '/proc/#/exe'
                except psutil.AccessDenied:
                    proc_info["exe"] = None
                proc_info["cpu_percent"] = proc.cpu_percent()

                mem_info = proc.memory_info()
                proc_info["mem_rss"] = mem_info.rss

                proc_info["num_threads"] = proc.num_threads()
                proc_info["nice_priority"] = proc.nice()
        except psutil.NoSuchProcess:
            # The process exited while we were reading it; skip it
            continue
        process_infos.append(proc_info)
    return process_infos

I call this function once per second, and after adding it I noticed that my application's CPU usage rose from ~1% to ~10%. The profiler shows that most of the CPU time is spent in psutil's _parse_stat_file function, which parses the contents of the /proc/<pid>/stat file.

The psutil documentation recommends using oneshot() for more efficient collection, but as you can see I already use it.

Is there something I am doing wrong here, or am I stuck with this overhead from psutil? If so, do you know of another utility that might solve my problem more efficiently?

Matan Bakshi asked Mar 05 '23 14:03


1 Answer

psutil author here.

I doubt other tools can do a significantly better job. Reading /proc/pid/stat is the only way for a user-space app to get this process info, so all of them (ps, top, etc.) basically do the same thing: read the file and parse it. As such, I don’t expect one to be significantly faster than another.
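To illustrate what "read the file and parse it" involves, here is a minimal sketch of such a parser. It is not psutil's actual implementation; the names `parse_stat_line` and `read_stat` are hypothetical, and the field positions are taken from the proc(5) man page for Linux.

```python
def parse_stat_line(line):
    """Parse one line in the /proc/<pid>/stat format into a small dict.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or parentheses, so we split on the LAST ')'.
    """
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    name = line[lpar + 1:rpar]
    # rest[0] is field 3 (state), so field n lives at rest[n - 3]
    rest = line[rpar + 2:].split()
    return {
        "pid": pid,
        "name": name,
        "ppid": int(rest[1]),          # field 4
        "utime_ticks": int(rest[11]),  # field 14, in clock ticks
        "stime_ticks": int(rest[12]),  # field 15, in clock ticks
        "nice": int(rest[16]),         # field 19
        "num_threads": int(rest[17]),  # field 20
        "rss_pages": int(rest[21]),    # field 24, in pages (not bytes)
    }


def read_stat(pid):
    """Read and parse /proc/<pid>/stat (Linux only)."""
    with open("/proc/{}/stat".format(pid)) as f:
        return parse_stat_line(f.read())
```

On a Linux machine, `read_stat(os.getpid())` would return the entry for the current process. Any tool that reports these values has to do roughly this much work per process per sample, which is why the cost scales with the number of processes and the polling rate.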

By using oneshot() you are already telling psutil to avoid reading that file more than once, so there is likely nothing you can do to speed it up further. Consider that you are asking for 7 stats for every running process every second, so some overhead is to be expected. I wouldn’t be surprised if top had similar CPU consumption.
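For reference, the same collection can be written with the `attrs` argument of `psutil.process_iter()`, which gathers the requested fields into `proc.info` and substitutes `ad_value` where access is denied. This is only a sketch: it should not be expected to make the parsing itself cheaper, the `snapshot` helper and its `iter_processes` parameter are hypothetical names introduced here, and the output keys are assumed to match those used in the question.

```python
ATTRS = ["pid", "ppid", "name", "exe", "cpu_percent",
         "memory_info", "num_threads", "nice"]


def snapshot(fqdn, iter_processes=None):
    """Return one dict per running process.

    iter_processes is a hypothetical hook: a callable returning objects
    with an `.info` dict. By default it wraps psutil.process_iter().
    """
    if iter_processes is None:
        import psutil  # third-party dependency, imported lazily

        def iter_processes():
            # Fields that raise AccessDenied are filled with ad_value (None)
            return psutil.process_iter(attrs=ATTRS, ad_value=None)

    rows = []
    for proc in iter_processes():
        info = dict(proc.info)
        mem = info.pop("memory_info")
        info["mem_rss"] = mem.rss if mem is not None else None
        info["nice_priority"] = info.pop("nice")
        info["computer"] = fqdn
        rows.append(info)
    return rows
```

Calling `snapshot("host.example.com")` returns a list of dicts like the one built in the question. If the remaining overhead is still too high, the levers left are the ones implied above: request fewer fields or sample less often than once per second.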

Giampaolo Rodolà answered May 10 '23 19:05