
How does Linux determine the next PID?

How does Linux determine the next PID it will use for a process? The purpose of this question is to better understand the Linux kernel. Don't be afraid to post kernel source code. If PIDs are allocated sequentially, how does Linux fill in the gaps? What happens when it hits the end?

For example, if I run a PHP script from Apache that does a <?php print(getmypid());?>, the same PID will be printed for a few minutes while I hit refresh. This period of time is a function of how many requests Apache is receiving. Even if there is only one client, the PID will eventually change.

When the PID changes, it will be a close number, but how close? The number does not appear to be entirely sequential. If I do a ps aux | grep apache I get a fair number of processes:

[screenshot: output of ps aux | grep apache]

How does Linux choose this next number? The previous few PIDs are still running, as well as the most recent PID that was printed. How does Apache choose to reuse these PIDs?

asked Aug 10 '10 at 07:08 by rook


2 Answers

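The kernel allocates PIDs in the range (RESERVED_PIDS, PID_MAX_DEFAULT). The upper bound checked in the code is pid_max, which defaults to PID_MAX_DEFAULT. Allocation is sequential within each namespace (tasks in different namespaces can have the same IDs). When the range is exhausted, PID assignment wraps around to RESERVED_PIDS and continues from there, skipping any PID that is still in use.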
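To see the upper bound on a running system, here is a minimal sketch, assuming Linux with procfs mounted at /proc. The limit is exposed as /proc/sys/kernel/pid_max; read_pid_max is a hypothetical helper name, not a kernel or libc function.

```c
#include <stdio.h>

/* Hypothetical helper: returns the current PID limit, or -1 if it
 * cannot be read (for example, on a system without /proc). */
long read_pid_max(void)
{
    FILE *f = fopen("/proc/sys/kernel/pid_max", "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;   /* file exists but did not contain a number */
    fclose(f);
    return value;
}
```

Printing the result of read_pid_max() from main shows the limit the allocator wraps at on that machine.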

Some relevant code:

Inside alloc_pid(...)

    for (i = ns->level; i >= 0; i--) {
        nr = alloc_pidmap(tmp);
        if (nr < 0)
            goto out_free;
        pid->numbers[i].nr = nr;
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }

alloc_pidmap()

    static int alloc_pidmap(struct pid_namespace *pid_ns)
    {
        int i, offset, max_scan, pid, last = pid_ns->last_pid;
        struct pidmap *map;

        pid = last + 1;
        if (pid >= pid_max)
            pid = RESERVED_PIDS;
        /* and later on... */
        pid_ns->last_pid = pid;
        return pid;
    }
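The "last + 1, wrap to RESERVED_PIDS" logic above can be tried out in user space. What follows is a simplified model, not the kernel's code: the kernel tracks used PIDs in a bitmap, while this sketch uses a plain bool array. The toy_ names and the tiny limits are made up for illustration.

```c
#include <stdbool.h>

/* Toy values for illustration only; the kernel's RESERVED_PIDS and
 * pid_max are far larger. */
#define TOY_RESERVED_PIDS 3
#define TOY_PID_MAX       8

struct toy_pid_ns {
    int  last_pid;              /* most recently allocated PID */
    bool in_use[TOY_PID_MAX];   /* in_use[n] is true while PID n is live */
};

/* Returns the next free PID after ns->last_pid, or -1 if none is free. */
int toy_alloc_pid(struct toy_pid_ns *ns)
{
    int pid = ns->last_pid + 1;

    /* Examine at most TOY_PID_MAX candidates before giving up. */
    for (int tries = 0; tries < TOY_PID_MAX; tries++, pid++) {
        if (pid >= TOY_PID_MAX)
            pid = TOY_RESERVED_PIDS;   /* wrap around past the reserved low range */
        if (!ns->in_use[pid]) {
            ns->in_use[pid] = true;
            ns->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Marks a PID as free again, as happens when a process is reaped. */
void toy_free_pid(struct toy_pid_ns *ns, int pid)
{
    ns->in_use[pid] = false;
}
```

Starting from a zeroed struct toy_pid_ns, successive calls hand out 1, 2, 3 and so on. Once the top is reached, the search restarts at TOY_RESERVED_PIDS and takes the first number that has been freed, which is how gaps get filled in this model.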

Do note that PIDs in the context of the kernel are more than just int identifiers; the relevant structure can be found in /include/linux/pid.h. Besides the id, it contains a list of tasks with that id, a reference counter and a hashed list node for fast access.

PIDs don't appear sequential in user space because the scheduler may run other processes between your process's fork() calls, and any of them can fork and take the next PID. It's very common, in fact.
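You can watch this from user space with a minimal sketch like the one below, assuming a POSIX system. spawn_children is a hypothetical helper that forks a few short-lived children and records the PIDs the kernel handed out.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks `count` children that exit immediately and
 * stores their PIDs in `pids`. Returns the number of children created. */
int spawn_children(pid_t *pids, int count)
{
    int created = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;               /* fork failed; stop early */
        if (pid == 0)
            _exit(0);            /* child: nothing to do */
        pids[created++] = pid;   /* parent: remember the child's PID */
    }

    /* Reap the children only after all of them exist, so none of their
     * PIDs can be handed out again while we are still forking. */
    for (int i = 0; i < created; i++)
        waitpid(pids[i], NULL, 0);

    return created;
}
```

On an idle machine the recorded PIDs are typically consecutive. On a busy one you should expect gaps wherever another process forked in between.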

answered Sep 20 '22 at 14:09 by Michael Foukarakis


I would rather assume the behavior you observe stems from another source:

Good web servers usually run several process instances to balance the load of the requests. These processes are managed in a pool, and one of them is assigned each time a request comes in. To optimize performance, Apache probably assigns the same process to a bunch of sequential requests from the same client. After a certain number of requests, that process is terminated and a new one is created.
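The recycling described above can be modeled with a toy sketch. This is an illustration under assumptions, not Apache's implementation: serve_with_recycling and its parameters are hypothetical, and the workers do no real work. For comparison, Apache's prefork MPM has a MaxConnectionsPerChild directive (MaxRequestsPerChild in older versions) that retires a child after a set number of connections.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical model: each worker "serves" up to requests_per_worker
 * requests and then exits; the parent starts a replacement. The PID that
 * served request i is written to served_by[i].
 * Returns 0 on success, -1 on error. */
int serve_with_recycling(pid_t *served_by, int total_requests,
                         int requests_per_worker)
{
    int done = 0;

    if (requests_per_worker <= 0)
        return -1;

    while (done < total_requests) {
        int batch = total_requests - done;
        if (batch > requests_per_worker)
            batch = requests_per_worker;

        pid_t worker = fork();
        if (worker < 0)
            return -1;
        if (worker == 0)
            _exit(0);   /* a real worker would handle `batch` requests here */

        /* Parent: every request in this batch sees the same worker PID. */
        for (int i = 0; i < batch; i++)
            served_by[done + i] = worker;
        done += batch;

        waitpid(worker, NULL, 0);   /* worker retired; the loop forks a new one */
    }
    return 0;
}
```

With 6 requests and 3 requests per worker, the first three entries of served_by share one PID and the last three share another. That matches the pattern in the question, where the PID stays put for a while and then jumps.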

I don't believe that Linux assigns the same PID to more than one process in a row.

Since you say the new PID is close to the last one, I guess Linux simply assigns each process the last PID + 1. But applications and system programs start and terminate processes in the background all the time, so you cannot predict the exact number the next Apache process will get.

Apart from this, you should not rely on any assumption about PID assignment in anything you implement. (See also sanmai's comment.)

answered Sep 17 '22 at 14:09 by chiccodoro