This is on Linux. The app is written in C++ with the ACE library.
I suspect that one of the threads in the process is occasionally getting blocked for an unusually long time (5 to 40 seconds). The app runs fine most of the time, but a couple of times a day it has this issue. Five other similar apps run on the same box, and they are also I/O bound due to heavy incoming socket data.
I would like to know if there is anything I can do programmatically to see whether the thread/process is getting its time slice.
If a process is being starved out, self-monitoring for that process would not be that productive. But if you just want that process to notice it hasn't been run in a while, it can call times() periodically and compare the relative difference in elapsed time with the relative difference in scheduled user time (you would sum the tms_utime and tms_cutime fields if you want to count waiting for children as productive time, and you would sum in the tms_stime and tms_cstime fields if you count kernel time spent on your behalf to be productive time). For thread times, the only way I know of is to consult the /proc filesystem.
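As a minimal sketch of the times() approach described above: the struct CpuSample and the helpers take_sample and cpu_share are hypothetical names chosen for illustration, and the choice to count all four tms fields as productive time is an assumption you may want to change. All values are in clock ticks.

```cpp
#include <sys/times.h>
#include <unistd.h>

// One snapshot: elapsed ticks and the CPU ticks charged to this process.
// (CpuSample is an illustrative name, not part of any library.)
struct CpuSample {
    clock_t elapsed;  // ticks since an arbitrary point in the past
    clock_t cpu;      // user + system ticks, including waited-for children
};

// Take a snapshot with times(). Assumption: all four tms fields count
// as productive time; drop the ones you do not want to include.
inline CpuSample take_sample() {
    struct tms t;
    clock_t now = times(&t);
    return CpuSample{now,
                     t.tms_utime + t.tms_stime + t.tms_cutime + t.tms_cstime};
}

// Fraction of the elapsed interval that the process spent on a CPU.
// Returns -1.0 when no time has elapsed between the two samples.
inline double cpu_share(const CpuSample& before, const CpuSample& after) {
    clock_t elapsed = after.elapsed - before.elapsed;
    if (elapsed <= 0) return -1.0;
    return static_cast<double>(after.cpu - before.cpu) /
           static_cast<double>(elapsed);
}
```

A thread could call take_sample() at regular intervals and pass consecutive samples to cpu_share(). Keep in mind that an I/O-bound process that is simply waiting for data also shows a low share, so a low value alone does not prove starvation, and a multi-threaded process can report a share above 1.0.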
A high-priority external process or high-priority thread could externally monitor processes (and threads) of interest by reading the appropriate /proc/<pid>/stat entries for the process (and /proc/<pid>/task/<tid>/stat for the threads). The user times are found in the 14th and 16th fields of the stat file. The system times are found in the 15th and 17th fields. (The field positions are accurate for my Linux 2.6 kernel.)
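The field extraction could be sketched as follows. StatTimes, parse_stat_line and read_stat_times are hypothetical names used only for illustration. One detail the sketch assumes matters: the second field (the command name) is wrapped in parentheses and may itself contain spaces, so the parser counts fields from the last closing parenthesis rather than splitting the whole line on whitespace. The values are reported in clock ticks.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// CPU times taken from a stat file, in clock ticks.
// (Illustrative struct, not part of any library.)
struct StatTimes {
    unsigned long utime = 0;   // field 14: user time
    unsigned long stime = 0;   // field 15: system time
    long cutime = 0;           // field 16: user time of waited-for children
    long cstime = 0;           // field 17: system time of waited-for children
};

// Parse the single line found in /proc/<pid>/stat or
// /proc/<pid>/task/<tid>/stat. Returns false if the line is malformed.
inline bool parse_stat_line(const std::string& line, StatTimes& out) {
    // Field 2 is "(comm)" and may contain spaces or ')', so start
    // counting after the last ')'.
    std::string::size_type close = line.rfind(')');
    if (close == std::string::npos) return false;
    std::istringstream rest(line.substr(close + 1));
    std::string skip;
    // The stream now starts at field 3; skip fields 3 through 13.
    for (int i = 0; i < 11; ++i) {
        if (!(rest >> skip)) return false;
    }
    return static_cast<bool>(rest >> out.utime >> out.stime
                                  >> out.cutime >> out.cstime);
}

// Read and parse a stat file. Returns false if it cannot be read.
inline bool read_stat_times(const std::string& path, StatTimes& out) {
    std::ifstream in(path);
    std::string line;
    if (!std::getline(in, line)) return false;
    return parse_stat_line(line, out);
}
```

For example, read_stat_times("/proc/self/stat", t) would fill t with the calling process's own counters.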
Between two time points, you determine the amount of elapsed time that has passed (a monitor process or thread would usually wake up at regular intervals). Then the difference between the cumulative processing times at each of those time points represents how much time the thread of interest got to run during that time. The ratio of processing time to elapsed time would represent the time slice.
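To make the arithmetic concrete, here is a small sketch of the ratio for an external monitor. Sample and run_fraction are hypothetical names, and the sketch assumes the CPU counters are in clock ticks, with the ticks-per-second value obtained from sysconf(_SC_CLK_TCK). For instance, if the counters advanced by 250 ticks over a 5-second interval at 100 ticks per second, the thread ran for 2.5 seconds, a ratio of 0.5.

```cpp
#include <chrono>
#include <unistd.h>

// One observation made by the monitor: when it was taken and the
// cumulative CPU ticks read at that moment. (Illustrative struct.)
struct Sample {
    std::chrono::steady_clock::time_point when;
    unsigned long cpu_ticks;
};

// Ratio of processing time to elapsed time between two observations.
// ticks_per_sec would normally come from sysconf(_SC_CLK_TCK).
// Returns -1.0 if the inputs do not describe a valid interval.
inline double run_fraction(const Sample& a, const Sample& b,
                           long ticks_per_sec) {
    double elapsed = std::chrono::duration<double>(b.when - a.when).count();
    if (elapsed <= 0.0 || ticks_per_sec <= 0 || b.cpu_ticks < a.cpu_ticks) {
        return -1.0;
    }
    double cpu_seconds = static_cast<double>(b.cpu_ticks - a.cpu_ticks) /
                         static_cast<double>(ticks_per_sec);
    return cpu_seconds / elapsed;
}
```

What counts as a suspiciously low ratio depends on the workload, so any alert threshold would have to be calibrated against the app's normal behavior.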
One last bit of info: on Linux, I use the following to obtain the tid of the current thread, so that I can examine the right task in the /proc/<pid>/task/ directory:
tid = syscall(__NR_gettid);
I do this because I could not find the gettid system call actually exported by any library on my system, even though it was documented. But it might be available on yours.
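A self-contained version of that call, together with the path it leads to, might look like the sketch below. current_tid and current_thread_stat_path are hypothetical helper names. The sketch uses SYS_gettid from <sys/syscall.h> in place of __NR_gettid, which should be equivalent. As far as I know, glibc 2.30 and later also export a gettid() wrapper directly, but the raw syscall works either way.

```cpp
#include <string>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Kernel thread id of the calling thread, obtained via the raw syscall
// so it does not depend on the C library exporting gettid().
inline pid_t current_tid() {
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// Path of the stat file for the calling thread:
// /proc/<pid>/task/<tid>/stat
inline std::string current_thread_stat_path() {
    return "/proc/" + std::to_string(getpid()) +
           "/task/" + std::to_string(current_tid()) + "/stat";
}
```

In the main thread the tid equals the pid; in other threads it differs, which is why each worker thread would need to record its own tid for the monitor to look up.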