I'm trying to memory-map a huge file (approx. 100 GB) in order to store a B-tree with billions of key-value pairs. Memory is too small to keep all of the data in RAM, so I'm mapping a file from disk, and instead of using malloc I return (and increment) a pointer into the mapped region.
#define MEMORY_SIZE 300000000

unsigned char *mem_buffer;   /* bump pointer into the mapped region */
void *start_ptr;             /* start of the mapped region */

void *my_malloc(int size) {
    unsigned char *ptr = mem_buffer;
    mem_buffer += size;
    return ptr;
}

void *my_calloc(int size, int object_size) {
    unsigned char *ptr = mem_buffer;
    mem_buffer += (size * object_size);
    return ptr;
}
void init(const char *file_path) {
    int fd = open(file_path, O_RDWR, S_IREAD | S_IWRITE);
    if (fd < 0) {
        perror("Could not open file for memory mapping");
        exit(1);
    }
    start_ptr = mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    mem_buffer = (unsigned char *) start_ptr;
    if (mem_buffer == MAP_FAILED) {
        perror("Could not memory map file");
        exit(1);
    }
    printf("Successfully mapped file.\n");
}

void unmap() {
    if (munmap(start_ptr, MEMORY_SIZE) < 0) {
        perror("Could not unmap file");
        exit(1);
    }
    printf("Successfully unmapped file.\n");
}
main method:
int main(int argc, char **argv) {
    init(argv[1]);

    unsigned char *arr = (unsigned char *) my_malloc(6);
    arr[0] = 'H';
    arr[1] = 'E';
    arr[2] = 'L';
    arr[3] = 'L';
    arr[4] = 'O';
    arr[5] = '\0';

    unsigned char *arr2 = (unsigned char *) my_malloc(5);
    arr2[0] = 'M';
    arr2[1] = 'I';
    arr2[2] = 'A';
    arr2[3] = 'U';
    arr2[4] = '\0';

    printf("Memory mapped string1: %s\n", arr);
    printf("Memory mapped string2: %s\n", arr2);

    struct my_btree_node *root = NULL;
    insert(&root, arr, 10);
    insert(&root, arr2, 20);
    print_tree(root, 0, false);

    // cin.ignore();

    unmap();
    return EXIT_SUCCESS;
}
The problem is that I receive Cannot allocate memory (errno 12) if the requested size is bigger than the actual memory, or a Segmentation fault if the requested space lies outside the mapped region. I was told that it is possible to map files bigger than the actual memory.
Will the system manage the file by itself, or am I responsible for mapping only the amount of free memory and, when accessing further space, unmapping and mapping at another offset?
Thank you
EDIT
OS: Ubuntu 14.04 LTS x86_64
bin/washingMachine: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=9dc831c97ce41b0c6a77b639121584bf76deb47d, not stripped
First, make sure that you are running on a 64-bit CPU in 64-bit mode. On a 32-bit CPU, the address space of your process is only 2^32 bytes (four gigabytes) large, and there's no way to fit 100 GB into that all at once -- there simply aren't enough addresses. (In addition, a big chunk of that address space will already be used by other mappings or reserved by the kernel.)
Second, there might be trouble even if the mapping fits into the address space. Memory that is mapped into your process (this also includes e.g. your program's code and data segments, and ditto for shared libraries) is split up into units of pages (usually 4 KB large each on x86), where each page requires some metadata in the kernel and the MMU. This is another resource that might be exhausted when creating huge memory mappings.
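(For a rough sense of scale: a 100 GB mapping split into 4 KB pages works out to roughly 26 million pages, each needing its own page-table entry.)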
As suggested in Mmap() an entire large file, you could try using MAP_SHARED. That might allow the kernel to allocate memory for the mapping lazily as pages in it are accessed, since it knows it can always swap a page out to the file on disk if there's a memory shortage. With MAP_PRIVATE, the kernel needs to allocate a new page each time a page is modified (since the change should not be carried through), which would not be safe to do lazily in case the system runs out of memory and swap.
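For illustration, here is a minimal sketch of the question's init() rewritten to use MAP_SHARED (the name init_shared is my choice; MEMORY_SIZE, start_ptr and mem_buffer are assumed to be the globals from the question):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumes MEMORY_SIZE, start_ptr and mem_buffer as defined in the question. */
void init_shared(const char *file_path) {
    int fd = open(file_path, O_RDWR);
    if (fd < 0) {
        perror("Could not open file for memory mapping");
        exit(1);
    }
    /* MAP_SHARED: modified pages can be written back to the file itself,
     * so the kernel does not need to reserve anonymous memory for them. */
    start_ptr = mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (start_ptr == MAP_FAILED) {
        perror("Could not memory map file");
        exit(1);
    }
    mem_buffer = (unsigned char *) start_ptr;
    printf("Successfully mapped file (MAP_SHARED).\n");
}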
You might also need to pass MAP_NORESERVE to mmap() when allocating more memory than there is physical memory, or set /proc/sys/vm/overcommit_memory (see proc(5)) to 1 (which is a bit ugly due to being system-wide though).
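As a sketch, the mmap() call in the question's init() would then become something like this (reusing the question's fd and MEMORY_SIZE):

    start_ptr = mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_NORESERVE, fd, 0);
    /* Alternative, system-wide: write 1 to /proc/sys/vm/overcommit_memory. */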
On my system, which is similar to yours with 8 GB of RAM and 8 GB of swap, MAP_SHARED alone is sufficient to mmap() a 40 GB file. MAP_PRIVATE along with MAP_NORESERVE works too.
If that doesn't work, then you're likely running into an MMU-related limit. Many modern CPU architectures support huge pages, which are pages larger than the default page size. The point of huge pages is that you need fewer pages to map the same amount of memory (assuming a large mapping), which reduces the amount of metadata and could make address translation and context switches more efficient. The downside of huge pages is decreased mapping granularity and increased wastage (internal fragmentation) when only a small portion of a page is used.
Pairing MAP_SHARED and some random file with huge pages is unlikely to work by the way (in case MAP_SHARED isn't enough to fix the problem). The file would need to be in a hugetlbfs.
Passing MAP_HUGETLB to mmap() requests allocation using huge pages (though it might just be for anonymous mappings, where it also seems that huge pages should be automatic on many systems nowadays). You might potentially also need to muck around with /proc/sys/vm/nr_hugepages and /proc/sys/vm/nr_overcommit_hugepages -- see this thread and the Documentation/vm/hugetlbpage.txt file in the kernel sources.
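If you want to experiment with MAP_HUGETLB, a minimal self-contained sketch for an anonymous mapping could look like this (the 1 GB size is just an example, and the call will fail unless huge pages have been reserved, e.g. via nr_hugepages):

#define _GNU_SOURCE   /* for MAP_ANONYMOUS / MAP_HUGETLB under strict -std flags */
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 1UL << 30; /* 1 GB, for example */
    /* Anonymous mapping backed by huge pages; a file-backed huge-page
     * mapping would need the file to live in a hugetlbfs. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap with MAP_HUGETLB");
        return 1;
    }
    munmap(p, len);
    return 0;
}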
Beware of alignment issues when writing your own memory allocator by the way. I hope this isn't too pluggish, but see this answer.
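For example, here is a hedged sketch of how the bump allocator above could round each allocation up to a conservative boundary (the 16-byte alignment and the function name are my assumptions, not part of the original code):

#include <stdint.h>

#define ALIGNMENT 16  /* assumed worst-case alignment for the stored objects */

/* Assumes the mem_buffer global from the question. */
void *my_malloc_aligned(int size) {
    /* Round the bump pointer up to the next ALIGNMENT boundary. */
    uintptr_t addr = (uintptr_t) mem_buffer;
    addr = (addr + (ALIGNMENT - 1)) & ~(uintptr_t) (ALIGNMENT - 1);
    mem_buffer = (unsigned char *) addr;

    unsigned char *ptr = mem_buffer;
    mem_buffer += size;
    return ptr;
}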
As a side note, any memory you access through a memory-mapped file must actually exist in the file. If the file is smaller than the mapping and you still want to be able to access the "extra" memory, you could grow the file first using ftruncate(2). (This might not actually increase its size much on disk if the filesystem supports sparse files with file holes.)
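A sketch of what that could look like (open_and_grow is a hypothetical helper name, not something from the question's code):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: grow the backing file before mmap()ing it so every
 * byte of the mapping is actually backed by the file. */
int open_and_grow(const char *file_path, off_t size) {
    int fd = open(file_path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    /* On most filesystems this creates a sparse file (a hole), so it costs
     * little disk space until pages are actually written. */
    if (ftruncate(fd, size) < 0) {
        perror("ftruncate");
        exit(1);
    }
    return fd;
}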