Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Determine if an allocation via malloc() is backed by a huge page

I understand pretty well how transparent hugepages work, and that any allocation, such as those performed by malloc may be satisfied by a huge page.

What I'd like to know, is if there is any check I can make (possibly heuristic) after an allocation to determine if the memory is backed by a huge page.

like image 516
BeeOnRope Avatar asked Oct 21 '22 02:10

BeeOnRope


People also ask

How do I allocate a large page?

Allocating huge pages of a specific size Some platforms support multiple huge page sizes. To allocate huge pages of a specific size, precede the huge pages boot command parameters with a huge page size selection parameter hugepagesz=<size> .

How big can malloc allocate?

The malloc() function reserves a block of storage of size bytes. Unlike the calloc() function, malloc() does not initialize all elements to 0. The maximum size for a non-teraspace malloc() is 16711568 bytes.

What are huge pages in Linux?

HugePages is a feature integrated into the Linux kernel 2.6. Enabling HugePages makes it possible for the operating system to support memory pages greater than the default (usually 4 KB).

How do I allocate more large pages in Linux?

Open the /etc/sysctl. conf file with root permissions and add the line to it manually. Reboot for these changes to take effect. Once again, check your system's allocation of huge pages in the /proc/meminfo virtual file.


1 Answers

You can determine the exact status of any page, including whether it is backed by a transparent (or non-transparent) hugepage by looking up the "pfn" (page frame number) in the /proc/kpageflags file. You get the pfn for a page by reading from the /proc/$PID/pagemap file for your process, which is indexed by virtual address.

Unfortunately, both the pfn value from pagemap1 and the entire /proc/kpageflags file are accessible only to root users. Still if you can run your process as root at least in the testing or benchmarking scenario you are interested in, this works well.

I wrote a small library called page-info which does the relevant parsing for you. Give it a range of memory and it will return you info on each page, including whether it is present in memory, backed by a hugepage, etc.

For example, running the included test process as sudo ./page-info-test THP gives the following output:

PAGE_SIZE = 4096, PID = 18868
       size memset       FLAG     SET   UNSET UNAVAIL
   0.25 MiB BEFORE        THP       0       1      64
   0.25 MiB AFTER         THP       0      65       0
   0.50 MiB BEFORE        THP       0       1     128
   0.50 MiB AFTER         THP       0     129       0
   1.00 MiB BEFORE        THP       0       1     256
   1.00 MiB AFTER         THP       0     257       0
   2.00 MiB BEFORE        THP       0       1     512
   2.00 MiB AFTER         THP       0     513       0
   4.00 MiB BEFORE        THP       0       1    1024
   4.00 MiB AFTER         THP     512     513       0
   8.00 MiB BEFORE        THP       0       1    2048
   8.00 MiB AFTER         THP    1536     513       0
  16.00 MiB BEFORE        THP       0       1    4096
  16.00 MiB AFTER         THP    3584     513       0
  32.00 MiB BEFORE        THP       0       1    8192
  32.00 MiB AFTER         THP    7680     513       0
  64.00 MiB BEFORE        THP       0       1   16384
  64.00 MiB AFTER         THP   15872     513       0
 128.00 MiB BEFORE        THP       0       1   32768
 128.00 MiB AFTER         THP   32256     513       0
 256.00 MiB BEFORE        THP       0       1   65536
 256.00 MiB AFTER         THP   65024     513       0
 512.00 MiB BEFORE        THP       0       1  131072
 512.00 MiB AFTER         THP  124416    6657       0
1024.00 MiB BEFORE        THP       0       1  262144
1024.00 MiB AFTER         THP       0  262145       0
DONE

The UNAVAIL column means that no information about the mapping was available - usually because the page has never been accesses and so isn't yet backed by any page at all. You can see that for these "largeish" allocations only a single page is mapped in following the allocation, since we haven't touched the memory.

The AFTER rows are the same information after calling memset() on the entire allocation, which causes all pages to be physically allocated. Here we can see that no allocations are backed by transparent hugepages until we hit allocations of 4 MiB, at which point the majority of each allocation is backed by THP, except for 513 pages (which turn out to be at the edges of the allocated region). At 512 MiB the system starts running out of available hugepages but still satisfies most of the allocation, but at 1024 MiB the entire allocation is satisfied with small pages.

This library isn't production ready so don't use it for anything critical (e.g., some failures simply call exit()). Contributions welcome.


1 Since kernel 4.0 approximately, before that the pfn was accessible to non-root user processes. From 4.0 to 4.1 or thereabouts, the entire pagemap was off-limits to non-root processes, but since then the file is again available but with the pfn masked out (it will always appear as zero).

like image 172
BeeOnRope Avatar answered Oct 24 '22 12:10

BeeOnRope