
Track Memory Usage in C++ and evaluate memory consumption

I ran into the following problem with my code: I used Valgrind and gperftools for heap checking and heap profiling to verify that I release all the memory I allocate. The output of these tools looks good, and it seems I'm not losing memory. However, the numbers shown by top and ps confuse me, because they do not match what Valgrind and gperftools report.

Here are the numbers:

  • Top reports: RES 150M
  • Valgrind (Massif) reports: 23M peak usage
  • gperftools Heap Profiler reports: 22.7M peak usage

My question is: where does the difference come from? I also tried to track stack usage in Valgrind, but without success.

Some more details:

  • The process basically loads data from MySQL via the C API into an in-memory storage
  • A leak check that breaks shortly after loading is done shows 144 bytes definitely lost and 10M reachable, which fits the amount that is currently allocated
  • The library performs no complex IPC; it starts a few threads, but only one of them does the work
  • It does not load other complex system libraries
  • The PSS size from /proc/pid/smaps corresponds to the RES size in top and ps

Do you have any idea where this difference in reported memory consumption comes from? How can I validate that my program is behaving correctly, and how could I investigate this issue further?

grundprinzip asked Nov 21 '12 10:11




1 Answer

I was finally able to solve the problem and will happily share my findings. In my experience, the best tool for evaluating the memory consumption of a program is the Massif tool from Valgrind. It lets you profile heap consumption and gives you a detailed analysis.

To profile the heap of your application, run valgrind --tool=massif prog. This gives you information about the typical memory allocation functions such as malloc and friends. To dig deeper, I activated the option --pages-as-heap=yes, which additionally reports the underlying system calls. As an example, here is something from my profiling session:

 67  1,284,382,720      978,575,360      978,575,360             0            0
100.00% (978,575,360B) (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
->87.28% (854,118,400B) 0x8282419: mmap (syscall-template.S:82)
| ->84.80% (829,849,600B) 0x821DF7D: _int_malloc (malloc.c:3226)
| | ->84.36% (825,507,840B) 0x821E49F: _int_memalign (malloc.c:5492)
| | | ->84.36% (825,507,840B) 0x8220591: memalign (malloc.c:3880)
| | |   ->84.36% (825,507,840B) 0x82217A7: posix_memalign (malloc.c:6315)
| | |     ->83.37% (815,792,128B) 0x4C74F9B: std::_Rb_tree_node<std::pair<std::string const, unsigned int> >* std::_Rb_tree<std::string, std::pair<std::string const, unsigned int>, std::_Select1st<std::pair<std::string const, unsigned int> >, std::less<std::string>, StrategizedAllocator<std::pair<std::string const, unsigned int>, MemalignStrategy<4096> > >::_M_create_node<std::pair<std::string, unsigned int> >(std::pair<std::string, unsigned int>&&) (MemalignStrategy.h:13)
| | |     | ->83.37% (815,792,128B) 0x4C7529F: OrderIndifferentDictionary<std::string, MemalignStrategy<4096>, StrategizedAllocator>::addValue(std::string) (stl_tree.h:961)
| | |     |   ->83.37% (815,792,128B) 0x5458DC9: var_to_string(char***, unsigned long, unsigned long, AbstractTable*) (AbstractTable.h:341)
| | |     |     ->83.37% (815,792,128B) 0x545A466: MySQLInput::load(std::shared_ptr<AbstractTable>, std::vector<std::vector<ColumnMetadata*, std::allocator<ColumnMetadata*> >*, std::allocator<std::vector<ColumnMetadata*, std::allocator<ColumnMetadata*> >*> > const*, Loader::params const&) (MySQLLoader.cpp:161)
| | |     |       ->83.37% (815,792,128B) 0x54628F2: Loader::load(Loader::params const&) (Loader.cpp:133)
| | |     |         ->83.37% (815,792,128B) 0x4F6B487: MySQLTableLoad::executePlanOperation() (MySQLTableLoad.cpp:60)
| | |     |           ->83.37% (815,792,128B) 0x4F8F8F1: _PlanOperation::execute_throws() (PlanOperation.cpp:221)
| | |     |             ->83.37% (815,792,128B) 0x4F92B08: _PlanOperation::execute() (PlanOperation.cpp:262)
| | |     |               ->83.37% (815,792,128B) 0x4F92F00: _PlanOperation::operator()() (PlanOperation.cpp:204)
| | |     |                 ->83.37% (815,792,128B) 0x656F9B0: TaskQueue::executeTask() (TaskQueue.cpp:88)
| | |     |                   ->83.37% (815,792,128B) 0x7A70AD6: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
| | |     |                     ->83.37% (815,792,128B) 0x6BAEEFA: start_thread (pthread_create.c:304)
| | |     |                       ->83.37% (815,792,128B) 0x8285F4B: clone (clone.S:112)
| | |     |                         
| | |     ->00.99% (9,715,712B) in 1+ places, all below ms_print's threshold (01.00%)
| | |     
| | ->00.44% (4,341,760B) in 1+ places, all below ms_print's threshold (01.00%)

As you can see, ~85% of my memory allocation comes from a single branch. The question is why the memory consumption is so high when the original heap profiling showed normal consumption. The example shows why. For allocation I used posix_memalign to make sure allocations happen on useful boundaries. This allocator was then passed down from the outer class to the inner member variables (a map in this case), so that they use it for their heap allocations. However, the boundary I chose (4096) was too large in my case. It means that when you request 4 bytes via posix_memalign, the block has to start on its own 4096-byte boundary, so the system effectively spends a full page on it. If you allocate many small values this way, you end up with lots of unused memory. Normal heap profiling tools do not report this memory, because they count only the bytes you requested, while the system allocation routines allocate more and hide the rest.

To solve this problem, I switched to a smaller boundary, which drastically reduced the memory overhead.

After many hours spent in front of Massif & Co., I can only recommend this tool for deep profiling, since it gives you a very good understanding of what is happening and makes errors easy to track down. For posix_memalign the situation is different: there are cases where it is really necessary, but in most cases you will be just fine with a normal malloc.

grundprinzip answered Sep 30 '22 21:09