Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

c++: Find largest container in a program

I am trying to analyze a large C++ program. The program heavily uses STL container data structures like set, map, unordered set, unordered map, vector etc. Sometimes they are nested, e.g. map of sets.

I want to find out, in a particular run of the program, which containers hold the largest number of elements (i.e. largest value of size()). I can do minor edits to the program.

If there was a way to iterate over all the containers, or if there was a way to intercept the (size modifying) APIs of the containers, that might have been helpful. But these are not possible.

How would you approach this?

Addition: The platform in Linux, compiler is either g++ or clang++.

like image 704
Arun Avatar asked Feb 16 '16 15:02

Arun


2 Answers

If you can do minor edits, can you add every container to a big list of them?
Like this:

std::set<......> my_set;  // existing code
all_containers.add( &my_set ); // minor edit IMHO

Then you can call all_containers.analyse() that'd call size() on each of them and print the results.

You can use something like that:

struct ContainerStatsI {
  virtual int getSize() = 0;
};
template<class T> struct ContainerStats : ContainerStatsI {
  T* p_;
  ContainerStats( T* p ) : p_(p) {}
  int getSize() { return p->size(); }
};
struct ContainerStatsList {
  std::list<ContainerStatsI*> list_; // or some other container....
  template<class T> void add( T* p ) {
    list_.push_back( new ContainerStats<T>(p) );
  }
  // you should probably add a remove(T* p) as well
  void analyse() {
    for( ContainerStatsI* p : list_ ) {
      p->getSize(); // do something with what's returned here
    }
  }
};
like image 59
BitWhistler Avatar answered Sep 23 '22 13:09

BitWhistler


This method is useful when your project is really big and have very much instances of different containers. Advantage of method is that you does not need to modify big amount of code. It allows you to narrow type of container to find. This method help to diagnose situatation per contaier and per type.

It is possible to redefine template< class T > struct allocator. Possible to rename original allocator in std headers or modify it. Make it possible to do statistics for allocation and deallocation. You will know count and size per type of elements. But You can not know which instance of container that have elements.

Template template< class T > struct allocator placed at library header files. It is always exists and does not need to rebuild your development environment library, becouse as far as you know that template is not possible to compile into static library (exclude specialisation). Templates compiled always with your sources. But may be problem with precompiled headers. For project it is possible to regenerate or not use it, but for library it is need to check. Possible this is bottleneck of method, but it is simple to verify exists problem or not.

There is one empirical method that is not guarantee accuracy. When Your application is shutdown, the containers deallocated after its elements deallocated. So you can write statistics per container of parent type how much internal elements was at which type of container.

For example let we have:

vector<A>({1,2,3}) and map<string,B>({1,2}) and map<string,B>({1,2})

This will generate deallocation event list like this:

B, B, map<string,B>,
A, A, map<string,A>,
A, A, A, vector<A>,

So you can know that 3 elements A at vector<A>, 2 elements A at map<string,A>, and 2 elements A at map<string,A>

like image 34
oklas Avatar answered Sep 25 '22 13:09

oklas