I was recently delving into the details of Linux's memory management as I want to implement something similar for my own toy kernel, so I was hoping if someone who's familiar with the details could help me understand one thing. Apparently the physical memory manager is a buddy algorithm, which is further specialised to return blocks of pages of a particular order (0 to 9, with 0 being just a single page). For each order the blocks are stored as a linked list. Say if a block of order 5 is requested but is not found on the list of order 5 blocks, the algorithm searches for a block in order 6, splits it into two, gives the requested half and moves the other half an order lower (as it is half in size). What I don't get is how the kernel stores these structures, or how it allocates space for them. Since for order 0 pages you would need 1M entries (each is a 4KiB page), does it mean that the kernel allocates 1MiB * sizeof(struct page)? What about the blocks of order 1 and above? Does the kernel reuse allocated blocks by marking them as a higher order, and when it needs to split it in two just return the block and get one that is unused?
Linux-based operating systems use a virtual memory system. Any address referenced by a user-space application must be translated into a physical address. This is achieved through a combination of page tables and address translation hardware in the underlying computer system.
Techopedia Explains Memory AllocationPrograms and services are assigned with a specific memory as per their requirements when they are executed. Once the program has finished its operation or is idle, the memory is released and allocated to another program or merged within the primary memory.
The malloc() function allocates size bytes and returns a pointer to the allocated memory. The memory is not initialized. If size is 0, then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
The slab allocator aims to to cache the freed object so that the basic structure is preserved between uses [ Bon94 ]. The slab allocator consists of a variable number of caches that are linked together on a doubly linked circular list called a cache chain.
What I don't get is how the kernel stores these structures, or how it allocates space for them. Since for order 0 pages you would need 1M entries (each is a 4KiB page), does it mean that the kernel allocates 1MiB * sizeof(struct page)?
Initialization of zones is done by calling paging_init()
(arch/x86/mm/init_32.c; some descriptions - https://www.kernel.org/doc/gorman/html/understand/understand005.html 2.3 Zone Initialisation and http://repo.hackerzvoice.net/depot_madchat/ebooks/Mem_virtuelle/linux-mm/vminit.html Initializing the Kernel Page Tables) from setup_arch()
via (native_pagetable_init()
and indirect call 1166 x86_init.paging.pagetable_init();
):
690 /*
691 * paging_init() sets up the page tables - note that the first 8MB are
692 * already mapped by head.S.
...*/
697 void __init paging_init(void)
698 {
699 pagetable_init();
...
711 zone_sizes_init();
712 }
pagetable_init()
creates kernel page tables in swapper_pg_dir
array of 1024 pgd_t
s.
zone_sizes_init()
actually defines zones of physical memory and calls free_area_init_nodes()
to initialize them with actual work done (for each NUMA node for_each_online_node(nid) {...}
) in free_area_init_node()
which calls three functions:
calculate_node_totalpages()
prints page counts for every node in dmesgalloc_node_mem_map()
does actual job of allocating struct page
for every physical page in this node; memory for them is allocated by bootmem allocator doc1 doc2 (you can see its debug with bootmem_debug=1
kernel boot option):4936 size = (end - start) * sizeof(struct page);
4937 map = alloc_remap(pgdat->node_id, size);
if (!map) map = memblock_virt_alloc_node_nopanic(size, pgdat->node_id);
free_area_init_core()
(with filling of bitmaps in struct zone
). Functionality of free_area_init_core
described for older kernels in http://repo.hackerzvoice.net/depot_madchat/ebooks/Mem_virtuelle/linux-mm/zonealloc.html#INITIALIZE as:
free_area_init_core()
The memory map is built, and the freelists and buddy bitmaps initialized, in free_area_init_core().
Free lists of orders in each zone are initialized and orders are marked as having no any free page: free_area_init_core()
-> init_currently_empty_zone()
-> zone_init_free_lists
:
4147 static void __meminit zone_init_free_lists(struct zone *zone)
4148 {
4149 unsigned int order, t;
4150 for_each_migratetype_order(order, t) {
4151 INIT_LIST_HEAD(&zone->free_area[order].free_list[t]);
4152 zone->free_area[order].nr_free = 0;
4153 }
4154 }
PS: There is init() in kernel, it is called start_kernel()
, and LXR (Linux cross-reference) will help you to navigate between functions (I posted links to lxr.free-electrons.com, but there are several online LXRs):
501 asmlinkage __visible void __init start_kernel(void)
...
528 boot_cpu_init();
529 page_address_init();
530 pr_notice("%s", linux_banner);
531 setup_arch(&command_line);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With