
More info on memory layout of an executable program (process)

I attended an interview at Samsung. They asked a lot of questions about the memory layout of a program. I barely know anything about this.

I googled "Memory layout of an executable program" and "Memory layout of process".

I'm surprised to see that there isn't much info on these topics. Most of the results are forum queries. I just wonder why?

These are the few links I found:

  1. Run-Time Storage Organization
  2. Run-Time Memory Organization
  3. Memory layout of C process (pdf)

I want to learn this from a proper book instead of some web links. (Randy Hyde's is also a book, but I'd like some other book.) In which book can I find clear and more detailed information on this subject?

I also wonder why operating systems books don't cover this. I read Stallings, 6th edition. It just discusses the Process Control Block.

Creating this entire layout is the task of the linker, right? Where can I read more about this process? I want COMPLETE info, from a program on the disk to its execution on the processor.

EDIT:

Initially, I was not clear even after reading the answers given below. Recently, I came across these articles, and after reading them I understood things clearly.

Resources that helped me in understanding:

  1. www.tenouk.com/Bufferoverflowc/Bufferoverflow1b.html
  2. 5 part PE file format tutorial: http://win32assembly.online.fr/tutorials.html
  3. Excellent article : http://www.linuxforums.org/articles/understanding-elf-using-readelf-and-objdump_125.html
  4. PE Explorer: http://www.heaventools.com/

Yes, "layout of an executable program (PE/ELF)" != "memory layout of process". Find out for yourself in the 3rd link. :)

Now that my concepts are clear, my questions make me look so stupid. :)

claws asked Dec 27 '09 20:12



2 Answers

How things are loaded depends very strongly on the OS and on the binary format used, and the details can get nasty. There are standards for how binary files are laid out, but it's really up to the OS how a process's memory is laid out. This is probably why the documentation is hard to find.

To answer your questions:

  1. Books:
    • If you're interested in how processes lay out their memory, look at Understanding the Linux Kernel. Chapter 3 talks about process descriptors, creating processes, and destroying processes.
    • The only book I know of that covers linking and loading in any detail is Linkers and Loaders by John Levine. There's an online and a print version, so check that out.

  2. Executable code is created by the compiler and the linker, but it's the linker that puts things in the binary format the OS needs. On Linux, this format is typically ELF; on Windows it's PE (a COFF derivative); older Unixes used COFF or a.out; and on Mac OS X it's Mach-O. This isn't a fixed list, though. Some OSes can and do support multiple binary formats. Linkers need to know the output format to create executable files.

  3. The process's memory layout is pretty similar to the binary format, because a lot of binary formats are designed to be mmap'd so that the loader's task is easier.

    It's not quite that simple though. Some parts of the process image (like uninitialized static data, the .bss section) are not stored directly in the binary file. Instead, the binary just contains the size of these sections. When the process is loaded into memory, the loader knows to allocate the right amount of zero-filled memory, so the binary file doesn't need to contain large empty sections.

    Also, the process's memory layout includes some space for the stack and the heap, where a process's call frames and dynamically allocated memory go. These generally live at opposite ends of a large address space.

This really just scratches the surface of how binaries get loaded, and it doesn't cover anything about dynamic libraries. For a really detailed treatment of how dynamic linking and loading work, read How to Write Shared Libraries.

Todd Gamblin answered Oct 20 '22 15:10


Here is one way a program can be executed from a file (*nix).

  • The process is created (e.g. with fork()). This gives the new process its own memory map. This includes a stack in some area of memory (usually high up in memory somewhere).
  • The new process calls exec() to replace the current executable (often a shell) with the new executable. Often, the new executable's .text (executable code and constants) and .data (r/w initialized variables) are set up for demand page mapping, that is, they are mapped into the process memory space as needed. Often, the .text section comes first, followed by .data. The .bss section (uninitialized variables) is often allocated after the .data section. Many times it is mapped to return a page of zeros when the page containing a bss variable is first accessed. The heap often starts at the next page boundary after the .bss section. The heap then grows up in memory while the stack grows down (remember I said usually, there are exceptions!).

If the heap and stack collide, that often causes an out-of-memory situation, which is why the stack is often placed in high memory, as far from the heap as possible.

In a system without a memory management unit, demand paging is usually unavailable but the same memory layout is often used.

Richard Pennington answered Oct 20 '22 15:10