Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is a process's address space divided into four segments (text, data, stack and heap)?

Why does a process's address space have to divide into four segments (text, data, stack and heap)? What is the advandatage? is it possible to have only one whole big segment?

like image 379
Newbie Avatar asked Oct 24 '11 09:10

Newbie


People also ask

What is the text segment of the process address space used for?

text segment The text segment (sometimes called the instruction segment) contains the executable program code and constant data. The text segment is marked by the operating system as read-only and cannot be modified by the process.

Is heap part of the data segment?

Heap is the segment where dynamic memory allocation usually takes place. When some more memory need to be allocated using malloc and calloc function, heap grows upward. The Heap area is shared by all shared libraries and dynamically loaded modules in a process.

What is data segment and stack segment?

automatic variables - are in the stack - the memory segment is called. Stack Segment. global variables+static variables +variables allocated using malloc/new - are in heap - this memory segment is called Data Segment.

In which segment is the heap?

Heap is the segment where dynamic memory allocation usually takes place. The heap area begins at the end of the BSS segment and grows to larger addresses from there.


1 Answers

There are multiple reasons for splitting programs into parts in memory.

One of them is that instruction and data memories can be architecturally distinct and discontiguous, that is, read and written from/to using different instructions and circuitry inside and outside of the CPU, forming two different address spaces (i.e. reading code from address 0 and reading data from address 0 will typically return two different values, from different memories).

Another is reliability/security. You rarely want the program's code and constant data to change. Most of the time when that happens, it happens because something is wrong (either in the program itself or in its inputs, which may be maliciously constructed). You want to prevent that from happening and know if there are any attempts. Likewise you don't want the data areas that can change to be executable. If they are and there are security bugs in the program, the program can be easily forced to do something harmful when malicious code makes it into the program data areas as data and triggers those security bugs (e.g. buffer overflows).

Yet another is storage... In many programs a number of data areas aren't initialized at all or are initialized to one common predefined value (often 0). Memory has to be reserved for these data areas when the program is loaded and is about to start, but these areas don't need to be stored on the disk, because there's no meaningful data there.

On some systems you may have everything in one place (section/segment/etc). One notable example here is MSDOS, where .COM-style programs have no structure other than that they have to be less than about 64KB in size and the first executable instruction must appear at the very beginning of file and assume that its location corresponds to IP=0x100 (where IP is the instruction pointer register). How code and data are placed and interleaved in a .COM program is unimportant and up to the programmer.

There are other architectural artifacts such as x86 segments. Again, MSDOS is a good example of an OS that deals with them. .EXE-style programs in it may have multiple segments in them that correspond directly to the x86 CPU segments, to the real-mode addressing scheme, in which memory is viewed through 64KB-long "windows" known as segments. The position of these windows/segments is relative to the value of the CPU's segment registers. By altering the segment register values you can move the "windows". In order to access more than 64KB one needs to use different segment register values and that often implies having multiple segments in the .EXE (can be not just one segment for code and one for data, but also multiple segments for either of them).

like image 191
Alexey Frunze Avatar answered Nov 15 '22 07:11

Alexey Frunze