
What is the relation between `task_struct` and `pid_namespace`?

I'm studying some kernel code and trying to understand how the data structures are linked together. I know the basic idea of how a scheduler works, and what a PID is. Yet I have no idea what a namespace is in this context, and can't figure out how all of those work together.

I have read some explanations (including parts of O'Reilly's "Understanding the Linux Kernel") and understand that the same PID could end up assigned to two processes because one has terminated and the ID got reallocated. But I can't figure out how all this is done.

So:

  1. What is a namespace in this context?
  2. What is the relation between task_struct and pid_namespace? (I already figured it has to do with pid_t, but don't know how)

Some references:

  • Definition of pid_namespace
  • Definition of task_struct
  • Definition of upid (see also pid just beneath it)
Ramzi Khahil asked Nov 06 '14 12:11




1 Answer

Perhaps these links might help:

  1. PID namespaces in operation
  2. A brief introduction to PID namespaces (this one comes from a sysadmin)

After going through the second link, it becomes clear that namespaces are a great way to isolate resources. And in any OS, Linux included, processes are one of the most crucial resources there are. In the author's own words:

Yes, that’s it, with this namespace it is possible to restart PID numbering and get your own “1” process. This could be seen as a “chroot” in the process identifier tree. It’s extremely handy when you need to deal with pids in day to day work and are stuck with 4 digits numbers…

So you sort of create your own private process tree and then assign it to a specific user and/or to a specific task. Within this tree, the processes need not worry about PIDs conflicting with those outside this 'container'. Hence it is as good as handing over this tree to a different 'root' user altogether. That fine fellow has done a wonderful job of explaining things, with a nice little example to top it off, so I won't repeat it here.

As far as the kernel is concerned, I can give you a few pointers to get you started. I am not an expert here but I hope this should help you to some extent.

This LWN article describes the older and the newer way of looking at PIDs. In its own words:

All the PIDs that a task may have are described in the struct pid. This structure contains the ID value, the list of tasks having this ID, the reference counter and the hashed list node to be stored in the hash table for a faster search. A few more words about the lists of tasks. Basically a task has three PIDs: the process ID (PID), the process group ID (PGID), and the session ID (SID). The PGID and the SID may be shared between the tasks, for example, when two or more tasks belong to the same group, so each group ID addresses more than one task. With the PID namespaces this structure becomes elastic. Now, each PID may have several values, with each one being valid in one namespace. That is, a task may have PID of 1024 in one namespace, and 256 in another. So, the former struct pid changes. Here is how the struct pid looked like before introducing the PID namespaces:

    struct pid {
        atomic_t count;                          /* reference counter */
        int nr;                                  /* the pid value */
        struct hlist_node pid_chain;             /* hash chain */
        struct hlist_head tasks[PIDTYPE_MAX];    /* lists of tasks */
        struct rcu_head rcu;                     /* RCU helper */
    };

And this is how it looks now:

    struct upid {
        int nr;                            /* moved from struct pid */
        struct pid_namespace *ns;          /* the namespace this value
                                            * is visible in */
        struct hlist_node pid_chain;       /* moved from struct pid */
    };

    struct pid {
        atomic_t count;
        struct hlist_head tasks[PIDTYPE_MAX];
        struct rcu_head rcu;
        int level;                     /* the number of upids */
        struct upid numbers[0];
    };

As you can see, the struct upid now represents the PID value -- it is stored in the hash and has the PID value. To convert the struct pid to the PID or vice versa one may use a set of helpers like task_pid_nr(), pid_nr_ns(), find_task_by_vpid(), etc.
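To make the quoted layout concrete, here is a minimal userspace sketch in C. It is not kernel code: the names toy_ns, toy_upid, toy_pid and toy_pid_nr_ns are hypothetical and only mimic the shape of pid_namespace, upid, pid and the idea behind pid_nr_ns(). It also assumes a fixed maximum nesting depth instead of the kernel's variable-length numbers[] array.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 4  /* assumption: fixed depth; the kernel sizes numbers[] dynamically */

/* Hypothetical stand-in for struct pid_namespace. */
struct toy_ns {
    int level;               /* 0 for the initial namespace */
    struct toy_ns *parent;   /* NULL for the initial namespace */
};

/* Hypothetical stand-in for struct upid: one (value, namespace) pair. */
struct toy_upid {
    int nr;
    struct toy_ns *ns;
};

/* Hypothetical stand-in for struct pid. */
struct toy_pid {
    int level;                               /* deepest level this pid lives at */
    struct toy_upid numbers[TOY_MAX_LEVEL];  /* numbers[i] is the value valid at level i */
};

/* Mimics the idea behind pid_nr_ns(): the value of `pid` as seen from `ns`,
 * or 0 if the pid is not visible from that namespace. */
static int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_ns *ns)
{
    assert(pid != NULL && ns != NULL);
    assert(pid->level < TOY_MAX_LEVEL);
    if (ns->level <= pid->level && pid->numbers[ns->level].ns == ns)
        return pid->numbers[ns->level].nr;
    return 0;
}
```

With a pid that carries 1024 at level 0 and 256 at level 1, the lookup returns 1024 from the initial namespace, 256 from the child namespace, and 0 from an unrelated sibling namespace.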

Though a bit dated, this information is enough to get you started. One more important structure deserves mention here: struct nsproxy. It is the focal point for all the namespaces a process is associated with. It contains a pointer (pid_ns_for_children) to the PID namespace that this process's children will use. The PID namespace of the current process is found using task_active_pid_ns.

Within struct task_struct, we have a namespace proxy pointer aptly called nsproxy, which points to this process's struct nsproxy structure. If you trace the steps needed to create a new process, you can find the relationship(s) between task_struct, struct nsproxy and struct pid.
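The difference between "the namespace my children will use" and "the namespace I live in" can be sketched in userspace C as follows. All toy_* names, including the pid_struct field, are hypothetical simplifications; only the pid_ns_for_children field name and the idea that the active namespace is the one at the deepest level of the task's struct pid are taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct toy_ns   { int level; struct toy_ns *parent; };
struct toy_upid { int nr; struct toy_ns *ns; };
struct toy_pid  { int level; struct toy_upid numbers[4]; };  /* assumption: depth capped at 4 */

/* Hypothetical stand-in for struct nsproxy (only the PID-related member). */
struct toy_nsproxy { struct toy_ns *pid_ns_for_children; };

/* Hypothetical stand-in for the relevant fields of task_struct. */
struct toy_task {
    int pid;                      /* global number, like task_struct.pid */
    struct toy_pid *pid_struct;   /* hypothetical field: the task's struct pid */
    struct toy_nsproxy *nsproxy;
};

/* Idea behind task_active_pid_ns(): the namespace recorded at the
 * deepest level of the task's own pid. */
static struct toy_ns *toy_task_active_pid_ns(const struct toy_task *t)
{
    assert(t != NULL && t->pid_struct != NULL);
    return t->pid_struct->numbers[t->pid_struct->level].ns;
}

/* The namespace a future child of this task would be created in. */
static struct toy_ns *toy_child_pid_ns(const struct toy_task *t)
{
    assert(t != NULL && t->nsproxy != NULL);
    return t->nsproxy->pid_ns_for_children;
}
```

In this model, pointing pid_ns_for_children at a fresh namespace (roughly what unshare with CLONE_NEWPID does) changes the answer of toy_child_pid_ns but leaves toy_task_active_pid_ns untouched, because the task's own struct pid does not change.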

A new process in Linux is always forked from an existing process, and its image is later replaced using execve (or a similar function from the exec family). Thus, as part of do_fork, copy_process is invoked.

As part of copying the parent process the following important things happen:

  1. task_struct is first duplicated using dup_task_struct.
  2. The parent process's namespaces are then handled by copy_namespaces. If no CLONE_NEW* flag was passed, the child simply shares the parent's nsproxy (its reference count is incremented); otherwise a new nsproxy structure is created and the child's nsproxy pointer points to it.
  3. For every task except the boot-time idle task (which reuses the statically allocated init_struct_pid), a new PID structure is allocated using alloc_pid. A short snippet from this function:

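    tmp = ns;
    for (i = ns->level; i >= 0; i--) {
        nr = alloc_pidmap(tmp);      /* pick a free number in this namespace */
        if (nr < 0)
            goto out_free;
        pid->numbers[i].nr = nr;     /* PID value as seen at level i */
        pid->numbers[i].ns = tmp;    /* namespace in which that value is valid */
        tmp = tmp->parent;           /* move up to the enclosing namespace */
    }

This populates one upid structure per namespace level, giving each a new PID value together with the namespace in which that value is valid.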
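The per-level allocation can be tried out in userspace with a small sketch. The toy_* names are hypothetical, and a plain counter (last_pid) stands in for the kernel's pidmap allocation, so this only illustrates the walk from the task's namespace up to the initial one.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 4  /* assumption: fixed maximum nesting depth */

struct toy_ns {
    int level;              /* 0 for the initial namespace */
    int last_pid;           /* assumption: simple counter instead of a pidmap */
    struct toy_ns *parent;  /* NULL for the initial namespace */
};
struct toy_upid { int nr; struct toy_ns *ns; };
struct toy_pid  { int level; struct toy_upid numbers[TOY_MAX_LEVEL]; };

/* Hypothetical allocator: hand out the next number in this namespace. */
static int toy_alloc_nr(struct toy_ns *ns)
{
    return ++ns->last_pid;
}

/* Fill in one (nr, ns) pair for every level from `ns` up to the root. */
static void toy_alloc_pid(struct toy_pid *pid, struct toy_ns *ns)
{
    struct toy_ns *tmp = ns;
    int i;

    assert(pid != NULL && ns != NULL && ns->level < TOY_MAX_LEVEL);
    pid->level = ns->level;
    for (i = ns->level; i >= 0; i--) {
        assert(tmp != NULL);              /* parent chain must reach level 0 */
        pid->numbers[i].nr = toy_alloc_nr(tmp);
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }
}
```

Allocating into a fresh child namespace whose parent has already handed out 1023 numbers yields 1 inside the child and 1024 in the parent, which is the "own 1 process" effect described earlier.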

Further on in copy_process, the global ID of this newly allocated PID (the number as seen from the initial namespace, obtained with pid_nr) is stored in the pid field of task_struct.

In the final stages of copy_process, a link is established between task_struct and this new pid structure through the pids array (of type struct pid_link) within task_struct, using the function attach_pid.
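As a rough userspace illustration of these last two steps, the sketch below stores the global number in the task and links the task and its pid in both directions. The toy_* names are hypothetical, a single back pointer replaces the kernel's list of tasks, and the two kernel steps are merged into one helper for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task;

struct toy_upid { int nr; };
struct toy_pid {
    int level;
    struct toy_upid numbers[4];    /* assumption: depth capped at 4 */
    struct toy_task *first_task;   /* assumption: one pointer instead of a task list */
};

struct toy_task {
    int pid;                       /* global number, like task_struct.pid */
    struct toy_pid *pid_link;      /* hypothetical simplification of the pid link */
};

/* Idea behind pid_nr(): the global value is the one stored at level 0. */
static int toy_pid_nr(const struct toy_pid *pid)
{
    return pid ? pid->numbers[0].nr : 0;
}

/* Merges two steps for illustration: record the global number in the task,
 * then connect task and pid to each other. */
static void toy_link_task_and_pid(struct toy_task *task, struct toy_pid *pid)
{
    assert(task != NULL && pid != NULL);
    task->pid = toy_pid_nr(pid);
    task->pid_link = pid;
    pid->first_task = task;
}
```

After linking, the task can reach every per-namespace value through its pid, and the pid can reach the task, which is the two-way relation the question asks about.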

There's a lot more to it, but I hope this gives you at least a head start.

NOTE: I am referring to the latest kernel version as of this writing, 3.17.2.

HighOnMeat answered Sep 28 '22 22:09
