Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do we need to use folly::fbvector instead of std::vector with allocator which reserve large uncommited area initially?

As known, if we push_back elements to the std::vector<>, and if the whole memory allocated in the vector is occupied, then std::vector<> reserves 2X of current size of memory (allocate new memory with 2X size), resize vector and copy old data to the new memory.

We can optimizing it, and Facebook done this in folly-library (FBVector is Facebook's drop-in implementation of std::vector. It has special optimizations for use with relocatable types and jemalloc https://github.com/facebook/folly/blob/master/folly/FBVector.h#L21 ).

I.e. when vector<> does not have enough memory to push_back new element, then we allocate more memory but not in 2 times more (in a different number of times: 1.3 - 1.5 times)

Description: https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

The graphical solver below reveals that choosing k = 1.5 (blue line) allows memory reuse after 4 reallocations, choosing k = 1.45 (red line) allows memory reuse after 3 reallocations, and choosing k = 1.3 (black line) allows reuse after only 2 reallocations.

enter image description here

But why do we need to use folly::fbvector<> instead of std::vector<> with our custom allocator which using VirtualAllocEx() (as shown here: For what do I need to use VirtualAlloc/VirtualAllocEx? ), or the same in linux https://stackoverflow.com/a/2782910/1558037 , where:

  • std::vector<>::reserve() - initially reserve large un-commited area of virtual address (allocate WMA, but doesn't allocate any PTE in PT), for example allocate initially 16 GB of virtual area and each time the lack of memory to commit memory (allocate PTEs - allocate physical area) equal 1 x SIZE of vector
  • std::vector<>::resize() - and then only commit a new batch of pages, allocate only new PTE in PT, without re-allocation of memory which already is used and without copying data from old memory to new

Overall:

The advantages of this approach with large un-commited area over the folly::vector<>: always we allocate only new part of memory and never copying old data.

The advantages of folly::vector<> approach over the std::vector<>: sometimes we do not need to allocate new memory, but copying the old data into the new memory should always.

like image 466
Alex Avatar asked Nov 09 '22 19:11

Alex


1 Answers

This is implementation-specific. GCC library does allocate twice as much, but Visual C++ does not. I believe, it uses 1.5 as well, not sure though.

I believe, folly is supposed to be operation-system agnostic, your approach is Windows/Linux-specific.

Object moving from old vector to the new one should be not that terrible if you choose your types carefully - that is, use std::unique_ptr as a data type.

like image 129
SergeyA Avatar answered Nov 14 '22 23:11

SergeyA