Which of the three values reported by ps (vsize, size and rss) is suitable for quick memory leak detection? For my purposes, if a process has been running for a few days and its memory use has kept increasing, that is a good enough indicator that it is leaking memory. I understand that a tool like valgrind should ultimately be used, but it is intrusive and so not always desirable.
To improve my understanding, I wrote a simple piece of C code that allocates 1 MiB of memory, frees it, and then allocates 1 MiB again. It sleeps for 10 seconds before every step, giving me time to look at the output of ps -p <pid> -ovsize=,size=,rss=. Here it is:
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

/* Thin wrapper around printf(), using standard C99 variadic macro syntax. */
#define info(...) printf(__VA_ARGS__)

/* Format n bytes into str as a rough B/KB/MB figure (integer division). */
char* bytes(char* str, size_t size, size_t n)
{
    const char* unit = "B";
    if (n > 1000) {
        n /= 1000;
        unit = "KB";
    }
    if (n > 1000) {
        n /= 1000;
        unit = "MB";
    }
    snprintf(str, size, "%zu %s", n, unit);
    return(str);
}

/* Allocate size bytes and write to all of them so every page is touched. */
void* xmalloc(size_t size)
{
    char msg[64];
    size_t max = sizeof(msg);
    void *p = NULL;
    info("Allocating %s\n", bytes(msg, max, size));
    p = malloc(size);
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    memset(p, '1', size);
    return(p);
}

/* Free p and return NULL so the caller can reset its pointer. */
void* xfree(void* p, size_t size)
{
    char msg[64];
    size_t max = sizeof(msg);
    info("Freeing %s\n", bytes(msg, max, size));
    free(p);
    return(NULL);
}

/* Pause long enough for ps to be run several times. */
void nap(void)
{
    const int dur = 10;
    info("Sleeping for %d seconds\n", dur);
    sleep(dur);
}

int main(void)
{
    int err = 0;
    size_t kb = 1024;
    size_t block = 1024 * kb;   /* 1 MiB */
    char* p = NULL;
    nap();
    p = xmalloc(block);
    nap();
    p = xfree(p, block);
    nap();
    p = xmalloc(block);
    nap();
    return(err);
}
ps was run every two seconds from a shell script, which also printed a timestamp for each measurement. Its output was:
# time vsize size rss
1429207116 3940 188 312
1429207118 3940 188 312
1429207120 3940 188 312
1429207122 3940 188 312
1429207124 3940 188 312
1429207126 4968 1216 1364
1429207128 4968 1216 1364
1429207130 4968 1216 1364
1429207132 4968 1216 1364
1429207135 4968 1216 1364
1429207137 3940 188 488
1429207139 3940 188 488
1429207141 3940 188 488
1429207143 3940 188 488
1429207145 5096 1344 1276
1429207147 5096 1344 1276
1429207149 5096 1344 1276
1429207151 5096 1344 1276
1429207153 5096 1344 1276
From the values above, and keeping in mind the descriptions given in the man page for ps(1), it seems to me that the best measure is vsize. Is this understanding correct? Note that the man page says that size is a measure of the total amount of dirty pages and rss is the amount of pages in physical memory. Both could well be lower than the total memory used by the process.
These experiments were tried on Debian 7.8 running GNU/Linux 3.2.0-4-amd64.
Generally speaking, the total virtual size (vsize) of your process is the main measure of process size. rss is just the portion that happens to be using real memory at the moment. size is a measure of how many pages have actually been modified.
A constantly increasing vsize, with relatively stable or cyclic size and rss values, might suggest heap fragmentation or a poor heap allocator algorithm.
A constantly increasing vsize and size, with a relatively stable rss, might suggest a memory leak, heap fragmentation, or a poor heap allocator algorithm.
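The two patterns above can be turned into a rough check over a series of ps samples. The following is only a sketch under stated assumptions: the names grows_steadily() and classify(), the enum values, and the tolerance parameter are all invented here for illustration, and the "never falls back below the running peak" rule is just one simple way to tell steady growth from a cyclic pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the patterns described in the text. */
enum pattern {
    NO_STEADY_GROWTH,        /* vsize is flat or cyclic */
    VSIZE_ONLY_GROWS,        /* fragmentation or allocator behaviour? */
    VSIZE_AND_SIZE_GROW      /* leak, fragmentation or allocator behaviour? */
};

/* Returns 1 if the samples (in KiB) end higher than they started and never
 * fall more than tolerance_kb below the highest value seen so far. */
int grows_steadily(const long *kb, size_t n, long tolerance_kb)
{
    long peak;
    size_t i;

    assert(kb != NULL && n >= 2);
    peak = kb[0];
    for (i = 1; i < n; i++) {
        if (kb[i] > peak)
            peak = kb[i];
        else if (peak - kb[i] > tolerance_kb)
            return 0;   /* memory was handed back: cyclic, not steady */
    }
    return kb[n - 1] - kb[0] > tolerance_kb;
}

/* Combine the vsize and size series into one of the labels above. */
enum pattern classify(const long *vsize, const long *size, size_t n,
                      long tolerance_kb)
{
    if (!grows_steadily(vsize, n, tolerance_kb))
        return NO_STEADY_GROWTH;
    if (grows_steadily(size, n, tolerance_kb))
        return VSIZE_AND_SIZE_GROW;
    return VSIZE_ONLY_GROWS;
}
```

Fed the four distinct vsize values from the question (3940, 4968, 3940, 5096), grows_steadily() returns 0, because the value drops back after the free(). A result of VSIZE_AND_SIZE_GROW is only a hint that a closer look is warranted, not proof of a leak.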
You will have to understand something of how a given program uses memory resources in order to use just these external measures of process resource usage to estimate whether it suffers from a memory leak or not.
Part of that involves knowing a little bit about how the heap is managed by the C library malloc() and free() routines, including what additional memory it might require internally to manage the list of active allocations, how it deals with fragmentation of the heap, and how it might release unused parts of the heap back to the operating system.
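One small, visible piece of that allocator overhead is rounding: the block handed back is often larger than what was asked for. The sketch below assumes a C library that provides malloc_usable_size() in <malloc.h> (a GNU extension available in glibc and musl, not standard C); the helper name usable_for() is invented here. It reports only the rounding slack, not the allocator's private bookkeeping, so treat it as a lower bound on the real overhead.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size(): GNU extension, an assumption */
#include <stdlib.h>

/* Returns how many usable bytes the allocator set aside for a request of
 * `request` bytes, or 0 if the allocation failed. */
size_t usable_for(size_t request)
{
    void *p;
    size_t usable;

    assert(request > 0);
    p = malloc(request);
    if (p == NULL)
        return 0;
    usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Comparing usable_for(n) with n for a few request sizes gives a feel for how much a program making many small allocations pays beyond the bytes it asked for.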
For example, your test shows that both the total virtual size of the process and the number of "dirty" pages it required grew slightly larger the second time the program allocated the same amount of memory. This probably shows some of the overhead of malloc(), i.e. the amount of memory its own internal data structures required up to that point. It would have been interesting to see what happened if the program had done another free() and sleep() before exiting. It might also be instructive to modify your code so that it calls sleep() between calling malloc() and memset(), and then observe the results from ps.
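The malloc()-then-memset() experiment can also be run from inside the process. This is a minimal sketch, assuming Linux with /proc mounted; the names status_kb(), take() and measure() are invented for illustration. VmSize and VmRSS are fields of /proc/self/status that correspond roughly to the vsize and rss columns of ps.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct snapshot { long vsize_kb; long rss_kb; };

/* Read a "Key:   123 kB" field from /proc/self/status (Linux-specific).
 * Returns the value in kB, or -1 if the file or the key is missing. */
long status_kb(const char *key)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long value = -1;
    size_t klen = strlen(key);

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            value = strtol(line + klen + 1, NULL, 10);
            break;
        }
    }
    fclose(f);
    return value;
}

static struct snapshot take(void)
{
    struct snapshot s;
    s.vsize_kb = status_kb("VmSize");
    s.rss_kb = status_kb("VmRSS");
    return s;
}

/* Fills out[0..3]: before malloc, after malloc, after touching the block,
 * and after free. Returns 0 on success, -1 if malloc fails. */
int measure(size_t bytes, struct snapshot out[4])
{
    volatile char *p;   /* volatile so the writes below are not optimized out */
    size_t i;

    assert(out != NULL);
    (void)status_kb("VmSize");   /* warm-up: let stdio set up its buffers */
    out[0] = take();
    p = malloc(bytes);
    if (p == NULL)
        return -1;
    out[1] = take();
    /* A 4096-byte stride touches every page if pages are 4 KiB or larger. */
    for (i = 0; i < bytes; i += 4096)
        p[i] = 1;
    out[2] = take();
    free((void *)p);
    out[3] = take();
    return 0;
}
```

Calling measure(8 * 1024 * 1024, s) and printing the four snapshots would be expected to show vsize rising at the malloc() step and rss rising only once the pages are written. What the last snapshot shows after free() depends on the allocator and on whether it hands the block back to the operating system.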
So, a simple program which should only require a fixed amount of memory to run, or which allocates memory to do a specific unit of work and then should free all of that memory once that unit of work is completed, should show a relatively stable vsize, assuming it never processes more than one unit of work at a time and does not have a "bad" pattern of allocation that would lead to heap fragmentation.
As you noted, a tool like valgrind, along with intimate knowledge of the program's internal implementation, is necessary to show actual memory leaks and prove they are solely the responsibility of the program.
(BTW, you might want to simplify your code somewhat. In particular, don't use unnecessary macros like info(). For this type of example, trying to be fancy with printing values in larger units, using extra variables to do size calculations, etc., is also more of an obfuscation than a help. Too many printfs also obfuscate the code: use only those you need to see what step the program is at and to see values that are not known at compile time.)