While this might look like a duplicate of other questions, let me explain why it's not.
I am looking to get a specific part of my application to degrade gracefully when a certain memory limit has been reached. I could have used a criterion based on the remaining available physical memory, but this wouldn't be safe: the OS could start paging out memory used by my application before the threshold is reached, so the check would conclude that there is still some physical memory left and the application would keep allocating. For the same reason, I can't use the amount of physical memory currently used by the process: as soon as the OS started swapping me out, that number would stop growing while I kept allocating.
For this reason, I chose a criterion based on the amount of memory allocated by my application, i.e. something very close to the virtual memory size.
This question (How to determine CPU and memory consumption from inside a process?) provides great ways of querying the amount of virtual memory used by the current process, which I THOUGHT was what I needed.
On Windows, I'm using GetProcessMemoryInfo() and the PrivateUsage field, which works great.
On Linux, I tried several things (listed below) that did not work. Virtual memory usage does not work for me because of something that happens with OpenCL context creation on NVidia hardware on Linux: the driver reserves a region of the virtual address space big enough to hold all RAM, all swap and all video memory. My guess is that it does so to provide a unified address space. But it also means that the process reports using enormous amounts of memory. On my system, for instance, top reports 23.3 GB in the VIRT column (12 GB of RAM, 6 GB of swap and 2 GB of video memory, which gives 20 GB reserved by the NVidia driver).
On OSX, by using task_info() and the virtual_size field, I also get a bigger number than expected (a few GB for an app that takes not even close to 1 GB on Windows), but not as big as on Linux.
So here is the big question: how can I get the amount of memory allocated by my application? I know that this is a somewhat vague question (what does "allocated memory" mean?), but I'm flexible:
What is really important is that the number grows with dynamic allocation (new, malloc, anything) and shrinks when the memory is released (which I know can be implementation-dependent).
Here are a couple of solutions I have tried and/or thought of but that would not work for me.
Read from /proc/self/status
This is the approach suggested by how-to-determine-cpu-and-memory-consumption-from-inside-a-process. However, as stated above, this returns the amount of virtual memory, which does not work for me.
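For reference, a minimal sketch of this approach. It assumes the field being read is VmSize (the post does not name it), which is the figure top shows in its VIRT column; the helper name vm_size_kb is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns the VmSize value from /proc/self/status
 * in kB, or -1 if the file or the field cannot be read. */
long vm_size_kb(void) {
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) return -1;

    char line[256];
    long kb = -1;
    while (fgets(line, sizeof line, f)) {
        /* The line looks like "VmSize:   123456 kB". */
        if (sscanf(line, "VmSize: %ld kB", &kb) == 1) break;
    }
    fclose(f);
    return kb;
}
```

Because this is the virtual size, it includes the address range reserved by the driver, which is exactly the problem described above.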
Read from /proc/self/statm
Very slightly worse: according to http://kernelnewbies.kernelnewbies.narkive.com/iG9xCmwB/proc-pid-statm-doesnt-match-with-status, which refers to Linux kernel code, the only difference between those two values is that the second one does not subtract reserved_vm from the amount of virtual memory. I would have HOPED that reserved_vm would include the memory reserved by the OpenCL driver, but it does not.
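A minimal sketch of the statm variant, for comparison. The first field of /proc/self/statm is the total program size in pages, so it has to be multiplied by the page size; the helper name statm_size_bytes is hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns the total program size reported by
 * /proc/self/statm, converted from pages to bytes, or -1 on failure. */
long statm_size_bytes(void) {
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f) return -1;

    long pages = -1;
    if (fscanf(f, "%ld", &pages) != 1) pages = -1;  /* first field only */
    fclose(f);

    if (pages < 0) return -1;
    return pages * sysconf(_SC_PAGESIZE);
}
```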
Use mallinfo() and the uordblks field
This does not seem to include all the allocations (I'm guessing the new calls are missing), since for a +2 GB growth in virtual memory space (after doing some memory-heavy work and still holding the memory), I'm only seeing about 0.1 GB growth in the number returned by mallinfo().
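One possible explanation, offered as an assumption rather than something verified on the system described here: glibc serves large requests with mmap() and reports them in the hblkhd field, not in uordblks. In addition, the fields of struct mallinfo are plain int, so they wrap once a value passes 2 GB. The sketch below adds the two fields together and uses mallinfo2(), which has size_t fields, where glibc 2.33 or later is available. The helper name malloc_bytes_in_use is hypothetical.

```c
#include <malloc.h>   /* glibc-specific: mallinfo(), mallinfo2() */
#include <stddef.h>

/* Hypothetical helper: bytes handed out by the glibc allocator, counting
 * both ordinary chunks (uordblks) and mmap()-backed blocks (hblkhd). */
size_t malloc_bytes_in_use(void) {
#if defined(__GLIBC__)
# if __GLIBC_PREREQ(2, 33)
    struct mallinfo2 mi = mallinfo2();          /* size_t fields */
    return mi.uordblks + mi.hblkhd;
# else
    struct mallinfo mi = mallinfo();            /* int fields, may wrap */
    return (size_t)(unsigned)mi.uordblks + (size_t)(unsigned)mi.hblkhd;
# endif
#else
# error "this sketch assumes glibc"
#endif
}
```

This still only sees memory that goes through malloc and friends; allocators that call mmap() directly, or plug-ins that bring their own allocator, remain invisible to it.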
Read the [heap] section size from /proc/self/smaps
This value started at around 336,760 KB and peaked at 1,019,496 KB for work that grew virtual memory space by +2 GB, and then it never goes back down, so I'm not sure I can really rely on this number...
Monitor all memory allocations in my application
Yes, in an ideal world, I would have control over everybody who allocates memory. However, this is a legacy application, using tons of different allocators: some malloc calls, some new calls, some OS-specific routines, etc. There are some plug-ins that could do whatever they want, they could be compiled with a different compiler, etc. So while this would be great to really control memory, this does not work in my context.
Read the virtual memory size before and after the OpenCL context initialization
While this could be a "hacky" way to solve the problem (and I might have to fall back to it), I would really wish for a more reliable way to query memory, because the OpenCL context could be initialized somewhere out of my control, and other similar but non-OpenCL-specific issues could creep in without my knowing about them.
So that's pretty much all I've got. There is one more thing I have not tried yet, because it only works on OSX: the approach described in Why does mstats and malloc_zone_statistics not show recovered memory after free?, i.e. using malloc_get_all_zones() and malloc_zone_statistics(). But I think this might have the same problem as mallinfo(), i.e. not taking all allocations into account.
So, can anyone suggest a way to query the memory usage (as vague a term as this is; see above for details) of a given process on Linux (and also on OSX, even if it's a different method)?
You can try and use information returned by getrusage():
#include <sys/time.h>
#include <sys/resource.h>

int getrusage(int who, struct rusage *usage);

struct rusage {
    struct timeval ru_utime; /* user CPU time used */
    struct timeval ru_stime; /* system CPU time used */
    long ru_maxrss;          /* maximum resident set size */
    long ru_ixrss;           /* integral shared memory size */
    long ru_idrss;           /* integral unshared data size */
    long ru_isrss;           /* integral unshared stack size */
    long ru_minflt;          /* page reclaims (soft page faults) */
    long ru_majflt;          /* page faults (hard page faults) */
    long ru_nswap;           /* swaps */
    long ru_inblock;         /* block input operations */
    long ru_oublock;         /* block output operations */
    long ru_msgsnd;          /* IPC messages sent */
    long ru_msgrcv;          /* IPC messages received */
    long ru_nsignals;        /* signals received */
    long ru_nvcsw;           /* voluntary context switches */
    long ru_nivcsw;          /* involuntary context switches */
};
If the memory information does not fit your purpose, observing the page fault counts can help monitor memory stress, which is what you intend to detect.
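A minimal sketch of sampling those values for the current process. The helper name memory_stress_sample is hypothetical. Note that ru_maxrss is a peak value (in kilobytes on Linux, in bytes on OS X), so it never shrinks, and that Linux leaves several of the other fields (ru_ixrss, ru_idrss, ru_isrss, ru_nswap among them) unmaintained.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: reports the peak resident set size and the
 * cumulative number of major (hard) page faults of the calling process.
 * Returns 0 on success, -1 if getrusage() fails. */
int memory_stress_sample(long *max_rss, long *major_faults) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;

    *max_rss = ru.ru_maxrss;       /* kB on Linux, bytes on OS X */
    *major_faults = ru.ru_majflt;  /* faults that required disk I/O */
    return 0;
}
```

Calling this periodically and watching how fast the major fault count rises between two samples gives a rough signal that the process is being paged, independently of how much address space it has reserved.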
Have you tried a shared library interposer for Linux for section (5) above? So long as your application is not statically linking the malloc functions, you can interpose a new function between your program and the C library's malloc. I've used this tactic many times to collect stats on memory usage.
It does require setting LD_PRELOAD before running the program, but no source or binary changes. It is an ideal answer in many cases.
Here is an example of a malloc interposer:
http://www.drdobbs.com/building-library-interposers-for-fun-and/184404926
You probably will also want to do calloc and free. Calls to new generally end up as a call to malloc so C++ is covered as well.
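A rough sketch of such an interposer, not taken from the linked article. It assumes glibc: sizes come from malloc_usable_size(), the real functions are found with dlsym(RTLD_NEXT, ...), and calloc returns NULL if dlsym() itself calls it during the lookup, on the assumption that dlsym() tolerates that. The accessor name tracked_live_bytes is hypothetical.

```c
#define _GNU_SOURCE              /* exposes RTLD_NEXT in <dlfcn.h> */
#include <dlfcn.h>
#include <malloc.h>              /* malloc_usable_size(), glibc extension */
#include <stdatomic.h>
#include <stddef.h>

static void *(*real_malloc)(size_t);
static void *(*real_calloc)(size_t, size_t);
static void *(*real_realloc)(void *, size_t);
static void  (*real_free)(void *);

static atomic_size_t live_bytes;   /* bytes currently held by callers */
static __thread int resolving;     /* non-zero while dlsym() is running */

/* Hypothetical accessor for the application to poll. */
size_t tracked_live_bytes(void) { return atomic_load(&live_bytes); }

/* Looks up the next definitions in link order (normally libc's).
 * Returns 0 when re-entered from inside dlsym(). Not hardened against
 * two threads racing through the very first allocation. */
static int resolve(void) {
    if (real_free) return 1;
    if (resolving) return 0;
    resolving = 1;
    real_malloc  = dlsym(RTLD_NEXT, "malloc");
    real_calloc  = dlsym(RTLD_NEXT, "calloc");
    real_realloc = dlsym(RTLD_NEXT, "realloc");
    real_free    = dlsym(RTLD_NEXT, "free");   /* set last: marks "done" */
    resolving = 0;
    return real_malloc && real_calloc && real_realloc && real_free;
}

void *malloc(size_t n) {
    if (!resolve()) return NULL;
    void *p = real_malloc(n);
    if (p) atomic_fetch_add(&live_bytes, malloc_usable_size(p));
    return p;
}

void *calloc(size_t count, size_t size) {
    if (!resolve()) return NULL;   /* assumption: dlsym() copes with this */
    void *p = real_calloc(count, size);
    if (p) atomic_fetch_add(&live_bytes, malloc_usable_size(p));
    return p;
}

void *realloc(void *p, size_t n) {
    if (!resolve()) return NULL;
    size_t old = p ? malloc_usable_size(p) : 0;
    void *q = real_realloc(p, n);
    if (q || n == 0) {             /* the old block is gone in both cases */
        atomic_fetch_sub(&live_bytes, old);
        if (q) atomic_fetch_add(&live_bytes, malloc_usable_size(q));
    }
    return q;
}

/* Limitation: blocks from posix_memalign(), aligned_alloc() or memalign()
 * are not counted on allocation, so freeing them skews the counter. */
void free(void *p) {
    if (!p || !resolve()) return;
    atomic_fetch_sub(&live_bytes, malloc_usable_size(p));
    real_free(p);
}
```

Built as a shared object (for example with gcc -shared -fPIC, plus -ldl on older glibc versions) and listed in LD_PRELOAD, this sees every malloc-family call in the process, plug-ins included, as long as they go through the dynamically linked C library.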
OS X seems to have similar capabilities but I have not tried it.
http://tlrobinson.net/blog/2007/12/overriding-library-functions-in-mac-os-x-the-easy-way-dyld_insert_libraries/
--Matt