
How to insert 100G integers into a vector on a 32-bit machine?

Say I have 100G integers and want to insert them into a vector<int> on a 32-bit machine. Is that possible?

If I use a custom allocator to manage the storage strategy, how can I guarantee that the following operations are always valid?

vector<int> coll;
coll.insert(100G integers);
memcpy(coll.begin() + (1024 * 1024 * 1024 * 8), "Hello", 5);

Note that the C++ standard requires the objects stored in a vector to be contiguous. coll.begin() + (1024 * 1024 * 1024 * 8) might refer to a location on the hard disk.

asked Nov 30 '22 18:11 by xmllmx


1 Answer

You can't use native pointers to address 100 G integers directly: at 4 bytes each they take 400 GB of memory. Some 32-bit OSes can address up to 2, 3 or 4 GB of RAM, and most can address up to 64 GB using PAE. Still, any 32-bit program uses 32-bit pointers, which can address at most 4 GB.
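The arithmetic behind that claim can be checked at compile time. This is only a sketch: the helper names are made up for illustration, "100G" is assumed to mean 100 * 10^9 elements, and int is assumed to be 4 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed for `count` elements of `elem_size` bytes.
// 64-bit arithmetic is used on purpose, the result does not fit in 32 bits.
constexpr std::uint64_t bytes_needed(std::uint64_t count, std::uint64_t elem_size) {
    return count * elem_size;
}

// Number of distinct byte addresses a 32-bit pointer can form: 2^32 (4 GiB).
constexpr std::uint64_t address_space_32bit() {
    return std::uint64_t{1} << 32;
}

// True if the data could, even in principle, be addressed with 32-bit pointers.
constexpr bool fits_in_32bit_address_space(std::uint64_t count, std::uint64_t elem_size) {
    return bytes_needed(count, elem_size) <= address_space_32bit();
}

// Assumption: "100G" means 100 * 10^9 elements.
constexpr std::uint64_t kHundredG = 100ULL * 1000 * 1000 * 1000;

static_assert(bytes_needed(kHundredG, 4) == 400ULL * 1000 * 1000 * 1000,
              "100G four-byte integers take 400 GB");
static_assert(!fits_in_32bit_address_space(kHundredG, 4),
              "400 GB exceeds the 4 GiB a 32-bit pointer can address");
```

Under these assumptions the data is roughly 93 times larger than the whole 32-bit address space, so no allocator that hands out native pointers can make every element addressable at once.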

All standard STL implementations (libstdc++ from GCC, libc++ from LLVM/Clang, STLport from Russia, Microsoft's STL...) use native pointers inside STL containers and the native (32-bit) size_t for container sizes.
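A rough way to see the size_t limit is to ask the container itself. The sketch below uses hypothetical helper names; vector::max_size() is standard, but the value it returns depends on the implementation and the target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: can std::vector<int>, as built for this target,
// even represent `count` elements? Compares against max_size().
inline bool vector_can_hold(std::uint64_t count) {
    const std::vector<int> v;
    return count <= static_cast<std::uint64_t>(v.max_size());
}

// Upper bound on the element count when byte sizes must fit in a size_t
// of `bits` bits and each element takes `elem_size` bytes.
constexpr std::uint64_t max_elements_for_size_t_bits(unsigned bits, std::uint64_t elem_size) {
    return (bits >= 64 ? std::numeric_limits<std::uint64_t>::max()
                       : ((std::uint64_t{1} << bits) - 1)) / elem_size;
}
```

With a 32-bit size_t and 4-byte elements the bound is 1,073,741,823 elements, about two orders of magnitude short of 100G. On a 32-bit build, vector_can_hold(100000000000ULL) would be expected to return false for the same reason.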

You may try a non-standard implementation of the STL such as STXXL, http://stxxl.sourceforge.net/ (intro slides), which reimplements some STL containers using disk (HDD) as storage. With a huge (at least 400 GB) and fast SSD, you may be able to fill the vector in several days, or even in tens of hours if you are lucky.

The key features of STXXL are: Transparent support of parallel disks. The library provides implementations of basic parallel disk algorithms. STXXL is the only external memory algorithm library supporting parallel disks. The library is able to handle problems of very large size (tested to up to dozens of terabytes).

But modern versions of STXXL do not support 32-bit platforms, and I can't say whether any older version will work on a 32-bit platform with this much data. STXXL uses some parts of the STL, and if any of them take size_t-sized arguments, your task will fail.
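To illustrate the general idea of disk-backed storage with indices wider than size_t, here is a deliberately naive sketch. DiskIntArray is a hypothetical class, not STXXL's API and not how STXXL works internally (a real external-memory library would cache blocks instead of touching the disk for every element). It also assumes that std::streamoff is 64 bits wide on the target, which is not guaranteed on every 32-bit platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical class: a file-backed array of 32-bit integers addressed
// by a 64-bit index, so the element count is not limited by size_t.
class DiskIntArray {
public:
    explicit DiskIntArray(const std::string& path) : path_(path) {
        {
            // Create (or truncate) the backing file first.
            std::ofstream create(path_.c_str(), std::ios::binary | std::ios::trunc);
        }
        file_.open(path_.c_str(), std::ios::in | std::ios::out | std::ios::binary);
        if (!file_) throw std::runtime_error("cannot open " + path_);
    }

    // Store `value` at element `index`, one seek and one write per call.
    void set(std::uint64_t index, std::int32_t value) {
        file_.clear();
        file_.seekp(offset(index));
        file_.write(reinterpret_cast<const char*>(&value), sizeof value);
        if (!file_) throw std::runtime_error("write failed");
    }

    // Load the element at `index`; throws if it lies beyond the end of the file.
    std::int32_t get(std::uint64_t index) {
        std::int32_t value = 0;
        file_.clear();
        file_.seekg(offset(index));
        file_.read(reinterpret_cast<char*>(&value), sizeof value);
        if (!file_) throw std::runtime_error("read failed");
        return value;
    }

private:
    // Byte offset of an element. Assumes std::streamoff can hold 64-bit values.
    static std::streamoff offset(std::uint64_t index) {
        return static_cast<std::streamoff>(index * sizeof(std::int32_t));
    }

    std::string path_;
    std::fstream file_;
};
```

Note that such a container hands out values through get and set rather than raw pointers, so nothing like memcpy(coll.begin() + offset, ...) can work on it: the elements are never all contiguous in memory at once.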

answered Dec 04 '22 11:12 by osgx