
What is the purpose of the GFP_HARDWALL flag in the Linux kernel?

GFP flags are used for memory allocation. What is the purpose of the GFP_HARDWALL flag in the Linux kernel?

user2633051 asked Oct 24 '13 12:10



2 Answers

It restricts allocation to the current cpuset, where a cpuset is a (guess!) set of CPUs and memory nodes.

Basically, your user process may be confined in a cpuset consisting of CPU#1 and CPU#2 but not CPU#3 nor CPU#4. Maybe there is some memory MEM#1 which is local to CPUs #1 and #2 but not the others, so this memory is part of the cpuset. Maybe there is some other memory MEM#2 local to CPUs #3 and #4, so this one is not part of the cpuset.

__GFP_HARDWALL ensures that you cannot allocate from MEM#2.

Hervé answered Oct 23 '22 04:10



I can't tell for sure. You are probably referring to __GFP_HARDWALL, an internal symbol that you are not really supposed to use directly. Nevertheless, here are my findings.

From include/linux/gfp.h:

Comment on #define __GFP_HARDWALL:

/* Enforce hardwall cpuset memory allocs */

Not that I really understand what hardwall means in that sentence, but you might. __GFP_HARDWALL is used in the definitions of GFP_USER, GFP_HIGHUSER and GFP_HIGHUSER_MOVABLE as well as GFP_CONSTRAINT_MASK. From the first three, I'm guessing it has something to do with user-space.
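To make the "used in the definitions" point concrete, here is a minimal user-space sketch of how a composite allocation mask can carry a hardwall bit. Everything prefixed with DEMO_ or demo_ is hypothetical: the bit values are made up for this illustration and are not the kernel's actual definitions. The only thing taken from the source is that GFP_USER-style masks include the hardwall bit while GFP_KERNEL-style masks do not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag type and bit values, chosen only for this demo. */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_WAIT      0x01u
#define DEMO_GFP_IO        0x02u
#define DEMO_GFP_FS        0x04u
#define DEMO_GFP_HARDWALL  0x08u  /* stands in for __GFP_HARDWALL */
#define DEMO_GFP_HIGHMEM   0x10u

/* Composite masks: the user-style ones add the hardwall bit. */
#define DEMO_GFP_KERNEL    (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_USER      (DEMO_GFP_KERNEL | DEMO_GFP_HARDWALL)
#define DEMO_GFP_HIGHUSER  (DEMO_GFP_USER | DEMO_GFP_HIGHMEM)

/* Compile-time check that the kernel-style mask stays hardwall-free. */
static_assert((DEMO_GFP_KERNEL & DEMO_GFP_HARDWALL) == 0,
              "kernel-style mask must not carry the hardwall bit");

/* True if the request asks for hardwall cpuset enforcement. */
static bool demo_is_hardwall(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_HARDWALL) != 0;
}
```

With these demo masks, demo_is_hardwall() is true for DEMO_GFP_USER and DEMO_GFP_HIGHUSER and false for DEMO_GFP_KERNEL, which matches the guess that the flag is tied to user-space allocations.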

From kernel/cpuset.c:

Comments on the __cpuset_node_allowed_softwall function (partly omitted):

* cpuset_node_allowed_softwall - Can we allocate on a memory node?
* ...
* If we're in interrupt, yes, we can always allocate. If __GFP_THISNODE is
* set, yes, we can always allocate. If node is in our task's mems_allowed,
* yes. If it's not a __GFP_HARDWALL request and this node is in the nearest
* hardwalled cpuset ancestor to this task's cpuset, yes. If the task has been
* OOM killed and has access to memory reserves as specified by the TIF_MEMDIE
* flag, yes.
* Otherwise, no.
*
* If __GFP_HARDWALL is set, cpuset_node_allowed_softwall() reduces to
* cpuset_node_allowed_hardwall(). Otherwise, cpuset_node_allowed_softwall()
* might sleep, and might allow a node from an enclosing cpuset.
*
* cpuset_node_allowed_hardwall() only handles the simpler case of hardwall
* cpusets, and never sleeps.
*
* <OMITTED>
*
* GFP_USER allocations are marked with the __GFP_HARDWALL bit,
* and do not allow allocations outside the current tasks cpuset
* unless the task has been OOM killed as is marked TIF_MEMDIE.
* GFP_KERNEL allocations are not so marked, so can escape to the
* nearest enclosing hardwalled ancestor cpuset.
*
* Scanning up parent cpusets requires callback_mutex. The
* __alloc_pages() routine only calls here with __GFP_HARDWALL bit
* _not_ set if it's a GFP_KERNEL allocation, and all nodes in the
* current tasks mems_allowed came up empty on the first pass over
* the zonelist. So only GFP_KERNEL allocations, if all nodes in the
* cpuset are short of memory, might require taking the callback_mutex
* mutex.
*
* The first call here from mm/page_alloc:get_page_from_freelist()
* has __GFP_HARDWALL set in gfp_mask, enforcing hardwall cpusets,
* so no allocation on a node outside the cpuset is allowed (unless
* in interrupt, of course).
*
* <OMITTED>
*
* Rule:
* Don't call cpuset_node_allowed_softwall if you can't sleep, unless you
* pass in the __GFP_HARDWALL flag set in gfp_flag, which disables
* the code that might scan up ancestor cpusets and sleep.
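The rules in that comment can be restated as a small user-space model. This is only a sketch of my reading of the quoted text, not the kernel's implementation: the model_ names, the bitmask representation of memory nodes, and the precomputed hardwall_ancestor_mems field are all assumptions, and anything in the omitted parts of the comment is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES     32
#define MODEL_GFP_THISNODE  0x1u  /* stands in for __GFP_THISNODE */
#define MODEL_GFP_HARDWALL  0x2u  /* stands in for __GFP_HARDWALL */

/* Hypothetical snapshot of the state the quoted comment talks about. */
struct model_task {
    bool in_interrupt;                   /* running in interrupt context */
    bool memdie;                         /* stands in for TIF_MEMDIE */
    unsigned int mems_allowed;           /* nodes in the task's own cpuset */
    unsigned int hardwall_ancestor_mems; /* nodes of the nearest hardwalled ancestor */
};

/* Can this task allocate on `node` with these flags, per the quoted rules? */
static bool model_node_allowed(const struct model_task *t, int node,
                               unsigned int gfp)
{
    assert(node >= 0 && node < MODEL_MAX_NODES);
    unsigned int bit = 1u << node;

    if (t->in_interrupt)
        return true;                     /* interrupts may always allocate */
    if (gfp & MODEL_GFP_THISNODE)
        return true;                     /* explicit node request */
    if (t->mems_allowed & bit)
        return true;                     /* node is in the task's cpuset */
    if (t->memdie)
        return true;                     /* OOM-killed task may use reserves */
    if (gfp & MODEL_GFP_HARDWALL)
        return false;                    /* hardwall: never look at ancestors */
    /* Softwall: allow a node from the nearest hardwalled ancestor. */
    return (t->hardwall_ancestor_mems & bit) != 0;
}
```

For a task whose cpuset holds only node 0 and whose nearest hardwalled ancestor holds nodes 0 and 1, a request for node 1 is refused when the hardwall bit is set and allowed when it is not. That is the difference between GFP_USER and GFP_KERNEL described in the comment.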

In the same file, there are also references to hardwall cpuset and hardwall memory. Still not sure what exactly hardwall means, but let's follow it to cpuset.

(There are also a lot of mentions of hardwall in the tile architecture code, but since that is the only architecture where it appears, I believe it's not related to what we are talking about here.)

I hit the jackpot. The documentation on cpusets says:

1.4 What are exclusive cpusets ?

If a cpuset is cpu or mem exclusive, no other cpuset, other than a direct ancestor or descendant, may share any of the same CPUs or Memory Nodes.

A cpuset that is cpuset.mem_exclusive or cpuset.mem_hardwall is "hardwalled", i.e. it restricts kernel allocations for page, buffer and other data commonly shared by the kernel across multiple users. All cpusets, whether hardwalled or not, restrict allocations of memory for user space. This enables configuring a system so that several independent jobs can share common kernel data, such as file system pages, while isolating each job's user allocation in its own cpuset. To do this, construct a large mem_exclusive cpuset to hold all the jobs, and construct child, non-mem_exclusive cpusets for each individual job. Only a small amount of typical kernel memory, such as requests from interrupt handlers, is allowed to be taken outside even a mem_exclusive cpuset.
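The layout described there (one large mem_exclusive cpuset holding several non-exclusive child cpusets) can be pictured with a small user-space sketch. The demo_cpuset structure and both helper functions are hypothetical and exist only for this illustration; the one idea taken from the documentation is that a cpuset counts as "hardwalled" when it is mem_exclusive or mem_hardwall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified cpuset: just what the sketch needs. */
struct demo_cpuset {
    const struct demo_cpuset *parent;  /* NULL for the root cpuset */
    bool mem_exclusive;
    bool mem_hardwall;
    unsigned int mems;                 /* bitmask of memory nodes */
};

/* "Hardwalled" as defined in the quoted documentation. */
static bool demo_is_hardwalled(const struct demo_cpuset *cs)
{
    return cs->mem_exclusive || cs->mem_hardwall;
}

/*
 * Walk up from `cs` until a hardwalled cpuset is found.
 * Falls back to the root if nothing on the way up is hardwalled.
 */
static const struct demo_cpuset *
demo_nearest_hardwalled(const struct demo_cpuset *cs)
{
    assert(cs != NULL);
    while (!demo_is_hardwalled(cs) && cs->parent != NULL)
        cs = cs->parent;
    return cs;
}
```

If a job's cpuset is a plain child of a mem_exclusive "jobs" cpuset, the walk stops at "jobs". Under the rules quoted above, kernel allocations made on behalf of that job could then spill over to any node of "jobs", while the job's user allocations stay confined to its own nodes.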

I'm going to leave you with this, because any conclusion I draw might in fact turn out to be wrong. Hopefully someone more knowledgeable in this particular field will come along and enlighten us.

Shahbaz answered Oct 23 '22 06:10
