It would be efficient for some purposes to allocate a huge amount of virtual space, and page in only pages that are accessed. Allocating a large amount of memory is instantaneous and does not actually grab pages:
char* p = new char[1024*1024*1024*256];
Ok, the above was wrong, as pointed out: the size expression is evaluated as a 32-bit int and overflows.
I expect that new calls malloc, which calls sbrk, and that when I access a location 4 GB beyond the start, it tries to extend the task's memory by that much?
Here is the full program:
#include <cstdint>
int main() {
constexpr uint64_t GB = 1ULL << 30;
char* p = new char[256*GB]; // allocate large block of virtual space
p[0] = 1;
p[1000000000] = 1;
p[2000000000] = 1;
}
Now, I get bad_alloc when attempting to allocate the huge amount, so obviously malloc won't work.
I was under the impression that mmap would map to files, but since this is suggested I am looking into it.
Ok, so mmap seems to support allocating big areas of virtual memory, but it requires a file descriptor. Creating huge in-memory data structures could be a win, but not if they have to be backed by a file.
The following code uses mmap even though I don't like the idea of attaching to a file. I did not know what address to request in virtual memory, and picked 0x8000000000. mmap returns -1, so obviously I'm doing something wrong:
#include <cstdint>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main() {
constexpr uint64_t GB = 1ULL << 30;
void *addr = (void*)0x8000000000ULL;
int fd = creat("garbagefile.dat", 0660);
char* p = (char*)mmap(addr, 256*GB, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
p[0] = 1;
p[1000000000] = 1;
p[2000000000] = 1;
close(fd);
}
Is there any way to allocate a big chunk of virtual memory and access pages sparsely, or is this not doable?
Is it possible to allocate large amount of virtual memory in linux?
Possibly. But you may need to configure it to be allowed:
The Linux kernel supports the following overcommit handling modes
0 - Heuristic overcommit handling. Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage. root is allowed to allocate slightly more memory in this mode. This is the default.
1 - Always overcommit. Appropriate for some scientific applications. Classic example is code using sparse arrays and just relying on the virtual memory consisting almost entirely of zero pages.
2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a configurable amount (default is 50%) of physical RAM. Depending on the amount you use, in most situations this means a process will not be killed while accessing pages but will receive errors on memory allocation as appropriate.
Useful for applications that want to guarantee their memory allocations will be available in the future without having to initialize every page.
The overcommit policy is set via the sysctl `vm.overcommit_memory'.
So, if you want to allocate more virtual memory than you have physical memory, then you'd want:
# in shell
sysctl -w vm.overcommit_memory=1
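Before changing the setting, it may help to check which mode the system is currently in. Here is a minimal sketch, assuming a Linux system where procfs exposes the sysctl as /proc/sys/vm/overcommit_memory. The function name read_overcommit_mode is hypothetical:

```cpp
#include <fstream>

// Returns the current vm.overcommit_memory mode (0, 1 or 2),
// or -1 if the procfs entry cannot be read (e.g. not on Linux).
int read_overcommit_mode() {
    std::ifstream in("/proc/sys/vm/overcommit_memory");
    int mode = -1;
    if (!(in >> mode)) {
        return -1;  // extraction failed or file missing
    }
    return mode;
}
```

A result of 1 would mean "always overcommit" is already active and the sysctl call is not needed.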
The process's address-space resource limit can also get in the way. From the getrlimit(2) man page:
RLIMIT_AS The maximum size of the process's virtual memory (address space) in bytes. This limit affects calls to brk(2), mmap(2) and mremap(2), which fail with the error ENOMEM upon exceeding this limit. Also automatic stack expansion will fail (and generate a SIGSEGV that kills the process if no alternate stack has been made available via sigaltstack(2)). Since the value is a long, on machines with a 32-bit long either this limit is at most 2 GiB, or this resource is unlimited.
So, you'd want:
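#include <sys/resource.h>
// setrlimit takes a pointer to a struct rlimit.
// Raising rlim_max above the current hard limit requires privilege.
struct rlimit lim;
lim.rlim_cur = RLIM_INFINITY;
lim.rlim_max = RLIM_INFINITY;
setrlimit(RLIMIT_AS, &lim);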
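If the process is not privileged, a more modest variant is to raise only the soft limit up to whatever hard limit is already in place, which an unprivileged process is allowed to do. This is a sketch of that idea, not a replacement for the call above. The function name raise_address_space_limit is hypothetical:

```cpp
#include <sys/resource.h>

// Raises the soft RLIMIT_AS limit to the current hard limit.
// Returns true on success, false if either system call fails.
bool raise_address_space_limit() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // soft limit may go up to the hard limit
    return setrlimit(RLIMIT_AS, &lim) == 0;
}
```

If the hard limit itself is too low, this will not help, and the limits.conf route is the remaining option.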
Or, if you cannot give the process permission to do this, then you can configure this persistently in /etc/security/limits.conf which will affect all processes (of a user/group).
Ok, so mmap seems to support ... but it requires a file descriptor. ... could be a win but not if they have to be backed by a file ... I don't like the idea of attaching to a file
You don't need a file-backed mmap. There's MAP_ANONYMOUS for that.
I did not know what number to put in to request
Then use null. Example:
mmap(nullptr, 256*GB, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
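To put the pieces together, here is a rough sketch of an anonymous mapping with error checking and sparse access. Note that mmap signals failure with MAP_FAILED rather than a null pointer, and errno says why. The helper names reserve_sparse and touch_sparsely are made up for illustration. MAP_NORESERVE is my addition: it asks the kernel not to reserve swap space for the mapping, which may help under heuristic overcommit, though as far as I know it is ignored in mode 2.

```cpp
#include <sys/mman.h>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Reserves `bytes` of zero-filled virtual address space with no backing file.
// Returns nullptr (after printing the reason) if the kernel refuses the mapping.
char* reserve_sparse(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {  // failure is MAP_FAILED, not nullptr
        std::fprintf(stderr, "mmap failed: %s\n", std::strerror(errno));
        return nullptr;
    }
    return static_cast<char*>(p);
}

// Writes to three widely separated offsets and reads them back.
// Only the pages containing those offsets need physical memory.
bool touch_sparsely(char* p, std::size_t bytes) {
    const std::size_t offsets[] = {0, bytes / 2, bytes - 1};
    for (std::size_t off : offsets) {
        p[off] = 1;
    }
    return p[0] == 1 && p[bytes / 2] == 1 && p[bytes - 1] == 1;
}
```

Calling reserve_sparse(256 * GB) with the 64-bit GB constant from the question would request the full range. Release it with munmap(p, bytes) when done.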
That said, if you've configured the system as described, then new should work just as well as mmap. It'll probably use malloc, which will probably use mmap for large allocations like this.
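If you want to probe whether new can satisfy a given size without dealing with bad_alloc, the nothrow form returns a null pointer instead of throwing. A small sketch, with the hypothetical helper name can_allocate:

```cpp
#include <cstddef>
#include <new>

// Tries to allocate `bytes` with operator new[] and reports whether it succeeded.
bool can_allocate(std::size_t bytes) {
    char* p = new (std::nothrow) char[bytes];
    if (p == nullptr) {
        return false;  // allocation refused, no exception thrown
    }
    p[0] = 1;  // touch the first byte
    delete[] p;
    return true;
}
```

Whether can_allocate(256 * GB) succeeds depends on the overcommit mode and resource limits discussed above.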
Bonus hint: You may benefit from using HugeTLB pages.
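One way to try this, sketched under assumptions: request the mapping with MAP_HUGETLB and fall back to normal pages when no huge pages are available. This assumes Linux, that the size is a multiple of the huge page size (commonly 2 MiB on x86-64, but that varies), and that the administrator has reserved huge pages. The function name map_prefer_hugetlb is hypothetical:

```cpp
#include <sys/mman.h>
#include <cstddef>

// Tries an anonymous mapping backed by HugeTLB pages and falls back to
// normal pages if that fails. Sets *used_huge to say which one was used.
// Returns nullptr if both attempts fail.
void* map_prefer_hugetlb(std::size_t bytes, bool* used_huge) {
    void* p = MAP_FAILED;
#ifdef MAP_HUGETLB
    p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    *used_huge = (p != MAP_FAILED);
    if (p == MAP_FAILED) {
        // No huge pages available (or unsupported size): use normal pages.
        p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    return p == MAP_FAILED ? nullptr : p;
}
```

Huge pages trade sparseness for fewer TLB misses, since each touched page commits a much larger chunk of physical memory, so this may or may not suit a very sparse access pattern.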
The value of 256*GB does not fit into the range of a 32-bit integer type. Try uint64_t as the type of GB:
constexpr uint64_t GB = 1024*1024*1024;
or, alternatively, force 64-bit multiplication:
char* p = new char[256ULL * GB];
OT: I would prefer this definition of GB:
constexpr uint64_t GB = 1ULL << 30;
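To see the difference at compile time, here is a small sketch. The original expression used signed int, where overflow is undefined behaviour, so the 32-bit case below uses unsigned literals to show the truncation safely. It assumes unsigned int is 32 bits wide. The names total and wrapped are just for illustration:

```cpp
#include <cstdint>

constexpr uint64_t GB = 1ULL << 30;

// 256 * GB is evaluated in 64 bits because GB is uint64_t.
constexpr uint64_t total = 256 * GB;
static_assert(total == (1ULL << 38), "256 GiB is 2^38 bytes");

// The same product in 32-bit unsigned arithmetic wraps around to 0,
// because 2^38 is a multiple of 2^32.
constexpr uint32_t wrapped = 1024u * 1024u * 1024u * 256u;
static_assert(wrapped == 0, "product truncated to 32 bits");
```

So a 32-bit size computation would not merely be too small here: it would request zero bytes.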
As for the virtual memory limit, see this answer.