I'm studying process execution on Linux 2.6.32 on a 64-bit box. While studying the output of /proc/$PID/maps, I observed one thing:
$ cat /proc/2203/maps | head -1
00400000-004d9000 r-xp 00000000 08:02 1050631 /bin/bash
$ cat /proc/27032/maps | head -1
00400000-00404000 r-xp 00000000 08:02 771580 /sbin/getty
It seems that the maps file for all the programs shows that the executable code for each program is loaded in a block of memory beginning at 0x00400000.
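For reference, the fields of the two maps lines above can be pulled apart with a few lines of Python. This is only a minimal sketch: `MapsEntry` and `parse_maps_line` are names made up for this illustration, and the field order (address range, permissions, offset, device, inode, path) is read off the output shown above.

```python
from typing import NamedTuple


class MapsEntry(NamedTuple):
    start: int    # first virtual address of the mapping
    end: int      # one past the last virtual address
    perms: str    # e.g. "r-xp"
    offset: int   # offset into the backing file
    dev: str      # device as major:minor
    inode: int
    path: str     # empty for anonymous mappings


def parse_maps_line(line: str) -> MapsEntry:
    # Split into at most six fields so a path containing spaces stays whole.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return MapsEntry(
        start=int(start_hex, 16),
        end=int(end_hex, 16),
        perms=fields[1],
        offset=int(fields[2], 16),
        dev=fields[3],
        inode=int(fields[4]),
        path=path,
    )


bash = parse_maps_line("00400000-004d9000 r-xp 00000000 08:02 1050631 /bin/bash")
getty = parse_maps_line("00400000-00404000 r-xp 00000000 08:02 771580 /sbin/getty")
print(hex(bash.start), hex(getty.start))  # both mappings start at the same virtual address
```

Both entries report the same `start` value even though they come from different processes and different files.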
I understand that these are virtual addresses. However, I don't get how these addresses can be the same for multiple concurrently running processes. What is the reason behind using a common start address for loading all processes, and how does the OS distinguish between the virtual load point of one process from another?
Edit:
From my understanding of address space virtualization using paging, I thought part of the virtual address was used to look up the physical address of a memory block (a frame) by using it to index one or more page tables. Consider this case. The address looks 32-bit (this is another thing that baffles me -- why are the program addresses 32-bit, but the addresses of the loaded libraries are 64-bit?). Breaking the address into ten, ten, and twelve bits corresponding to the page directory entry, the page table entry, and the page offset respectively, shouldn't 0x00400000 always mean "page directory entry 1, page table entry 0, offset 0", no matter what program performs the address translation?
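The split described above can be checked with a short calculation. The sketch below computes the 10/10/12 split used by classic two-level 32-bit x86 paging and, for comparison, the 9/9/9/9/12 split that x86-64 uses with 4 KiB pages; the function names are made up for this illustration.

```python
def split_32bit(vaddr: int):
    """Two-level 32-bit x86 split: 10-bit directory index, 10-bit table index, 12-bit offset."""
    directory_index = (vaddr >> 22) & 0x3FF
    table_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    return directory_index, table_index, offset


def split_x86_64(vaddr: int):
    """Four-level x86-64 split for 4 KiB pages: four 9-bit indices, then a 12-bit offset."""
    level4 = (vaddr >> 39) & 0x1FF
    level3 = (vaddr >> 30) & 0x1FF
    level2 = (vaddr >> 21) & 0x1FF
    level1 = (vaddr >> 12) & 0x1FF
    offset = vaddr & 0xFFF
    return level4, level3, level2, level1, offset


print(split_32bit(0x00400000))   # (1, 0, 0)
print(split_x86_64(0x00400000))  # (0, 0, 2, 0, 0)
```

Either way, the indices depend only on the address itself. Which frame they lead to depends on which set of tables is being indexed.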
One way I can see how this can be done is if the OS modified the page directory entry #1 to point to the page table corresponding to the program each time a task switch is performed. If that's the case, it sounds like a lot of added complexity -- given that program code is position-independent, won't it be easier to just load the program at an arbitrary virtual address and just go from there?
The answer is that each process has its own page tables. They are switched when processes are switched.
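As a toy illustration of that idea (not how the kernel is actually written), the sketch below models each process as owning its own table from virtual page number to physical frame number, with a context switch simply changing which table is active. The class names and the frame numbers are invented for the example.

```python
PAGE_SIZE = 4096


class Process:
    def __init__(self, name, page_table):
        self.name = name
        # Maps virtual page number -> physical frame number (hypothetical values).
        self.page_table = page_table


class ToyCPU:
    def __init__(self):
        # Stands in for the register that holds the page-table base (CR3 on x86).
        self.active_table = None

    def context_switch(self, process):
        # Switching processes switches the table used for every later translation.
        self.active_table = process.page_table

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        frame = self.active_table[vpn]  # a KeyError here stands in for a page fault
        return frame * PAGE_SIZE + offset


# Both processes map virtual page 0x400 (address 0x00400000) to different frames.
bash = Process("bash", {0x400: 0x1A2B})
getty = Process("getty", {0x400: 0x7C0D})

cpu = ToyCPU()
cpu.context_switch(bash)
print(hex(cpu.translate(0x00400010)))   # 0x1a2b010
cpu.context_switch(getty)
print(hex(cpu.translate(0x00400010)))   # 0x7c0d010
```

The same virtual address resolves to a different physical address depending only on which process's table is active.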
More information at http://www.informit.com/articles/article.aspx?p=101760&seqNum=3.
The kernel switches the page tables when a context switch happens. On operating systems where the kernel is mapped into every process's address space, the kernel mappings can stay in place across the switch. On the other hand, 32-bit operating systems that give the full 4 GiB to user processes also have to switch page tables when entering the kernel (for example, on a syscall).
While virtual addressing doesn't require different processes to have different page tables (the dependency goes the other way), I can't think of any current operating system that doesn't give each process its own page tables.
Q: I understand that these are virtual addresses.
A: Good...
Q: However, I don't get how these addresses can be the same for multiple concurrently running processes.
A: I thought you just said you understood "virtual addresses" ;)?
Q: What is the reason behind using a common start address for loading all processes?
A: Remember, it's a virtual address - not a physical address. Why not have some standard start address?
And remember - you don't want to make the start address "0" - there are a lot of specific virtual addresses (especially those under 640K) that a process might wish to map as though it were a physical address.
Here's a good article that touches on a few of these issues, including "e_entry": How main() is executed on Linux
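For the curious, "e_entry" is the entry-point field of the ELF header, and it can be read without any special tooling. The following is a minimal sketch: `read_e_entry` and `entry_point_of` are helper names invented here, and the offsets assume the standard ELF header layout (16-byte `e_ident`, then `e_type`, `e_machine`, `e_version`, then `e_entry` at byte 0x18).

```python
import struct


def read_e_entry(header: bytes) -> int:
    """Return the e_entry virtual address from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    if ei_data not in (1, 2):
        raise ValueError("unknown ELF data encoding")
    endian = "<" if ei_data == 1 else ">"
    if ei_class == 2:
        (entry,) = struct.unpack_from(endian + "Q", header, 0x18)
    elif ei_class == 1:
        (entry,) = struct.unpack_from(endian + "I", header, 0x18)
    else:
        raise ValueError("unknown ELF class")
    return entry


def entry_point_of(path: str) -> int:
    """Convenience wrapper: read just enough of the file to cover the ELF header."""
    with open(path, "rb") as f:
        return read_e_entry(f.read(64))
```

On a Linux system, something like `hex(entry_point_of("/bin/bash"))` prints the address where execution of the binary is set to begin; the exact value depends on how that binary was built.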