
Why is the kernel concerned about issuing PHYSICALLY contiguous pages?

Tags:

linux-kernel

When a process requests physical memory pages from the Linux kernel, the kernel does its best to provide a block of pages that are physically contiguous in memory. I was wondering why it matters that the pages are PHYSICALLY contiguous; after all, the kernel can obscure this fact by simply providing pages that are VIRTUALLY contiguous.

Yet the kernel certainly tries its hardest to provide pages that are PHYSICALLY contiguous, so I'm trying to figure out why physical contiguity matters so much. I did some research and, across a few sources, uncovered the following reasons:

1) it makes better use of the cache and achieves lower average memory-access times (GigaQuantum: I don’t understand: how?)

2) you have to fiddle with the kernel page tables in order to map pages that AREN’T physically contiguous (GigaQuantum: I don’t understand this one: isn’t each page mapped separately? What fiddling has to be done?)

3) mapping pages that aren’t physically contiguous leads to greater TLB thrashing (GigaQuantum: I don’t understand: how?)

Per the comments I inserted, I don't really understand these three reasons, and none of my research sources adequately explained or justified them. Can anyone explain them in a little more detail?

Thanks! This will help me better understand the kernel...

asked Nov 14 '11 by GigaQuantum



2 Answers

The main answer really lies in your second point. Typically, when memory is allocated within the kernel, it isn't mapped at allocation time - instead, the kernel maps as much physical memory as it can up-front, using a simple linear mapping. At allocation time it just carves out some of this memory for the allocation - since the mapping isn't changed, it has to already be contiguous.

The large, linear mapping of physical memory is efficient: both because large pages can be used for it (which take up less space for page-table entries and fewer TLB entries), and because altering the page tables is a slow process (so you want to avoid doing this at allocation/deallocation time).

Allocations that are only virtually contiguous can be requested using the vmalloc() interface rather than kmalloc().
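
To make the contrast concrete, here is a minimal kernel-module sketch (mine, not part of the original answer; the 64 KiB size is arbitrary) that requests both kinds of allocation:

```c
#include <linux/module.h>
#include <linux/slab.h>      /* kmalloc()/kfree() */
#include <linux/vmalloc.h>   /* vmalloc()/vfree() */

static void *kbuf, *vbuf;

static int __init alloc_demo_init(void)
{
	/* Physically contiguous: carved out of the kernel's linear
	 * mapping, so no page tables are touched at allocation time. */
	kbuf = kmalloc(64 * 1024, GFP_KERNEL);
	if (!kbuf)
		return -ENOMEM;

	/* Only virtually contiguous: scattered physical pages are
	 * stitched together into a fresh virtual range, which does
	 * require editing the kernel page tables. */
	vbuf = vmalloc(64 * 1024);
	if (!vbuf) {
		kfree(kbuf);
		return -ENOMEM;
	}

	pr_info("kmalloc buffer %p, vmalloc buffer %p\n", kbuf, vbuf);
	return 0;
}

static void __exit alloc_demo_exit(void)
{
	vfree(vbuf);
	kfree(kbuf);
}

module_init(alloc_demo_init);
module_exit(alloc_demo_exit);
MODULE_LICENSE("GPL");
```

This is also why a large kmalloc() can fail on a fragmented system while a vmalloc() of the same size still succeeds: the latter doesn't need a contiguous physical block.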

On 64-bit systems the kernel's mapping can encompass the entirety of physical memory; on 32-bit systems (except those with a small amount of physical memory), only a proportion of physical memory is directly mapped.
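
As a side note, pages outside that direct mapping (ZONE_HIGHMEM on 32-bit) have to be mapped temporarily before the kernel can touch their contents. A rough sketch, assuming the kmap_local_page() interface (older kernels used kmap_atomic() for the same purpose):

```c
#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/string.h>

/* On a 32-bit kernel, a GFP_HIGHUSER page may have no permanent
 * kernel mapping, so a short-lived one is created around the access.
 * On 64-bit this simply resolves to the page's linear-map address. */
static void zero_one_highmem_page(void)
{
	struct page *page = alloc_page(GFP_HIGHUSER);
	void *addr;

	if (!page)
		return;

	addr = kmap_local_page(page);   /* set up a temporary mapping */
	memset(addr, 0, PAGE_SIZE);
	kunmap_local(addr);             /* and tear it down again */

	__free_page(page);
}
```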

answered Oct 11 '22 by caf


Actually the behavior of memory allocation you describe is common to many OS kernels, and the main reason is the kernel's physical page allocator. Typically, the kernel has one physical page allocator that is used to allocate pages for both kernel space (including pages for DMA) and user space. In kernel space you need contiguous memory, because it's expensive (for in-kernel code) to map pages every time you need them. On x86_64, for example, mapping pages on demand would be pointless, because the kernel can see the whole address space (on 32-bit systems the virtual address space is limited to 4G, so typically the top 1G is dedicated to the kernel and the bottom 3G to user space).

The Linux kernel uses the buddy algorithm for page allocation, so that allocating a bigger chunk takes fewer iterations than allocating many smaller chunks (smaller chunks are obtained by splitting bigger ones). Moreover, using one allocator for both kernel space and user space allows the kernel to reduce fragmentation. Imagine that you allocated pages for user space one page per iteration: if user space needs N pages, you make N iterations. What happens when the kernel then wants some contiguous memory? How could it build a big enough contiguous chunk if single pages have been stolen from each big chunk and handed to user space?
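
As a concrete illustration of those power-of-two splits: the buddy allocator deals in "orders", where order 0 is one page, order 1 is two pages, order 3 is eight pages, and so on. A small sketch (the helper names here are mine) using the in-kernel API:

```c
#include <linux/gfp.h>
#include <linux/mm.h>   /* get_order() */

/* One order-3 request returns 8 physically contiguous pages in a
 * single step, split down from a larger free block if necessary. */
static struct page *grab_eight_pages(void)
{
	unsigned int order = get_order(8 * PAGE_SIZE);   /* == 3 */

	return alloc_pages(GFP_KERNEL, order);
}

/* On free, the allocator may merge the block with its "buddy"
 * back into a larger free block, which counters fragmentation. */
static void release_eight_pages(struct page *page)
{
	__free_pages(page, get_order(8 * PAGE_SIZE));
}
```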

[update]

Actually, the kernel allocates contiguous blocks of memory for user space less frequently than you might think. Sure, it allocates them when it builds the ELF image of a binary, when it performs readahead as a user process reads a file, and it creates them for IPC operations (pipes, socket buffers) or when the user passes the MAP_POPULATE flag to the mmap syscall. But typically the kernel uses a "lazy" page-loading scheme: it gives a contiguous range of virtual memory to user space (when the user first calls malloc, or calls mmap), but it doesn't fill that range with physical pages; it allocates a page only when a page fault occurs. The same is true when a user process forks: the child's address space is mapped "read-only", and when the child modifies some data, a page fault occurs and the kernel replaces the page in the child's address space with a new one (so that parent and child now have different pages). Typically the kernel allocates only one page in these cases.
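
You can watch the lazy scheme from user space. This standalone demo program (mine, for illustration) maps 64 MiB anonymously and shows that the resident set only grows once the pages are actually touched:

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Resident set size in pages: second field of /proc/self/statm. */
static long resident_pages(void)
{
	long size = 0, resident = -1;
	FILE *f = fopen("/proc/self/statm", "r");

	if (f) {
		if (fscanf(f, "%ld %ld", &size, &resident) != 2)
			resident = -1;
		fclose(f);
	}
	return resident;
}

int main(void)
{
	size_t len = 64 * 1024 * 1024;   /* 64 MiB of virtual space */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* No physical pages are backing the region yet... */
	printf("after mmap:   %ld resident pages\n", resident_pages());

	/* ...until we fault them in by writing to them. */
	memset(p, 1, len);
	printf("after memset: %ld resident pages\n", resident_pages());

	munmap(p, len);
	return 0;
}
```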

Of course there's a big question of memory fragmentation here. Kernel space always needs contiguous memory: if the kernel allocated pages for user space from "random" physical locations, it would be much harder to obtain a big chunk of contiguous memory in the kernel after some time (for example, after a week of system uptime). Memory would be too fragmented by then.

To mitigate this problem the kernel uses a "readahead" scheme for page faults as well: when a page fault occurs in the address space of some process, the kernel allocates and maps more than one page (because there's a good chance the process will read/write data from the following pages), and of course it uses a physically contiguous block of memory for this where possible, just to reduce potential fragmentation.
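
You can observe this behaviour (called "fault-around" in today's kernels) with mincore(), which reports which pages of a mapping are resident. In this demo (mine; the exact counts vary with kernel version and tuning), touching a single byte of a file-backed mapping typically makes several neighbouring pages resident:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

#define NPAGES 32

int main(void)
{
	long psize = sysconf(_SC_PAGESIZE);
	size_t len = NPAGES * psize;
	unsigned char vec[NPAGES];
	char path[] = "/tmp/faultaround-XXXXXX";
	int fd = mkstemp(path);
	char *p;
	int i, resident = 0;

	if (fd < 0 || ftruncate(fd, len) != 0) {
		perror("tempfile");
		return 1;
	}

	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	volatile char c = p[0];   /* fault in one single byte */
	(void)c;

	/* Ask the kernel which of the 32 pages are now resident. */
	if (mincore(p, len, vec) == 0) {
		for (i = 0; i < NPAGES; i++)
			resident += vec[i] & 1;
		printf("touched 1 byte, %d of %d pages resident\n",
		       resident, NPAGES);
	}

	munmap(p, len);
	close(fd);
	unlink(path);
	return 0;
}
```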

answered Oct 11 '22 by Dan Kruchinin