
Where are memory segments defined?

I just learned about the different memory segments, such as Text, Data, Stack and Heap. My questions are:

1- Where are the boundaries between these sections defined? Is it in the compiler or the OS?

2- How does the compiler or OS know which addresses belong to each section? Should we define it anywhere?

doubleE asked Dec 07 '22 20:12


2 Answers

This answer is from the point of view of a more special-purpose embedded system rather than a more general-purpose computing platform running an OS such as Linux.

Where are the boundaries between these sections defined? Is it in the compiler or the OS?

Neither the compiler nor the OS does this. It's the linker that determines where the memory sections are located. The compiler generates object files from the source code, and the linker uses the linker script file to place the contents of those object files in memory. The linker script (or linker directive) file is part of the project and identifies the type, size and address of the various memories, such as ROM and RAM.

The linker uses the information in the linker script to know where each memory starts. It then places each kind of content from the object files into an appropriate memory section. For example, code goes in the .text section, which is usually located in ROM. Variables go in the .data or .bss section, which are located in RAM. The stack and heap also go in RAM. As the linker fills one section it learns the size of that section and so knows where to start the next one. For example, the .bss section may start where the .data section ended.
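As a rough illustration of the boundaries being known to the linker: on Linux with the GNU toolchain, the linker provides the symbols etext, edata and end (documented in the end(3) man page), which mark the first address past the text, initialized data and uninitialized data (.bss) segments. The sketch below assumes that environment. An embedded linker script will usually define its own symbol names, so treat these as an assumption. The helpers and variables (addr_of, in_data_region, in_bss, initialized_var, zeroed_var) are hypothetical names made up for this example.

```c
#include <stdint.h>

/* Provided by the linker on Linux/GNU toolchains (see end(3)).
 * Assumption: other platforms and embedded linker scripts use other names. */
extern char etext, edata, end;

int initialized_var = 42; /* initialized global: placed in .data */
int zeroed_var;           /* uninitialized global: placed in .bss */

/* Convert a pointer to an integer so that unrelated addresses can be compared. */
static uintptr_t addr_of(const void *p) {
    return (uintptr_t)p;
}

/* Nonzero if p lies between the end of text and the end of initialized data.
 * That range holds .data (and typically read-only data as well). */
int in_data_region(const void *p) {
    return addr_of(p) >= addr_of(&etext) && addr_of(p) < addr_of(&edata);
}

/* Nonzero if p lies between the end of initialized data and the end of .bss. */
int in_bss(const void *p) {
    return addr_of(p) >= addr_of(&edata) && addr_of(p) < addr_of(&end);
}
```

On such a system, in_data_region(&initialized_var) and in_bss(&zeroed_var) should both be nonzero, even though nothing in the source code says where either variable lives. The linker chose the addresses and defined the boundary symbols.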

The size of the stack and heap may be specified in the linker script file or as project options in the IDE.

IDEs for embedded systems typically provide a generic linker script file automatically when you create a project. The generic linker file is suitable for many projects so you may never have to customize it. But as you customize your target hardware and application further you may find that you also need to customize the linker script file. For example, if you add an external ROM or RAM to the board then you'll need to add information about that memory to the linker script so that the linker knows how to locate stuff there.

The linker can generate a map file which describes how each section was located in memory. The map file may not be generated by default and you may need to turn on a build option if you want to review it.

How does the compiler or OS know which addresses belong to each section?

Well, I don't believe the compiler or OS actually knows this information, at least not in the sense that you could query them for it. The compiler has finished its job before the linker locates the memory sections, so the compiler doesn't know. The OS, well, how do I explain this? An embedded application may not even use an OS. The OS is just some code that provides services for an application. The OS doesn't know and doesn't care where the boundaries of the memory sections are. All that information is already baked into the executable code by the time the OS is running.

Should we define it anywhere?

Look at the linker script (or linker directive) file and read the linker manual. The linker script is input to the linker and provides the rough outlines of memory. The linker locates everything in memory and determines the extent of each section.

kkrambo answered Dec 11 '22 10:12


For your query:

Where are the boundaries between these sections defined? Is it in the compiler or the OS?

The answer is the OS.

There is no universally common addressing scheme for the layout of the .text segment (executable code), .data segment (variables) and other program segments. However, the layout of the program itself is well-formed according to the system (OS) that will execute the program.

How does the compiler or OS know which addresses belong to each section? Should we define it anywhere?

I divided this question into three questions:

What about the text (code) and data sections and their limits?

Text and data are prepared by the compiler. The compiler is required to make sure that they are accessible and to pack them in the lower portion of the address space. The accessible address space is limited by the hardware, e.g. if the instruction pointer register is 32-bit, then the text address space is limited to 4 GiB.

What about the heap section and its limit? Is it the total available RAM?

The area above text and data is the heap. With virtual memory, the heap can in practice grow close to the maximum address space.

Do the stack and the heap have a static size limit?

The final segment in the process address space is the stack. The stack occupies the top of the address space: it starts at the end and grows down.

Because the heap grows up and the stack grows down, they basically limit each other. Also, because both types of segment are writable, it wasn't always a violation for one of them to cross the boundary, so you could have a buffer or stack overflow. There are now mechanisms to stop this from happening.

Each process starts with a set limit for the heap (and for the stack). The heap limit can be changed at runtime (using brk()/sbrk()). Basically, when the process needs more heap space and has run out of allocated space, the standard library issues the call to the OS. The OS allocates a page, which is usually managed by the user library on the program's behalf. I.e. if the program wants 1 KiB, the OS will give an additional 4 KiB; the library will hand 1 KiB to the program and keep 3 KiB for the next time the program asks for more.
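The 1 KiB / 4 KiB bookkeeping described above can be sketched with a toy allocator built on sbrk(). This is only an illustration under stated assumptions, not how any real C library implements malloc: it assumes a Linux-like system where sbrk() is available, assumes a 4 KiB page size, never frees memory, and simply abandons the leftover pool when a request does not fit. All names (toy_alloc, toy_pool_left, toy_os_requests, TOY_PAGE_SIZE) are hypothetical. Mixing direct sbrk() calls with malloc in a real program is not recommended.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare sbrk() */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size, for illustration only */

static char *pool_cur = NULL;    /* next free byte in the library's pool */
static size_t pool_left = 0;     /* bytes still unused in the pool */
static unsigned os_requests = 0; /* how many times we asked the OS for memory */

/* Hand out `size` bytes, asking the OS for whole pages only when needed. */
void *toy_alloc(size_t size) {
    size = (size + 15u) & ~(size_t)15u; /* round the request up to 16 bytes */
    if (size > pool_left) {
        /* Not enough left: move the program break up by whole pages. */
        size_t grow = ((size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE) * TOY_PAGE_SIZE;
        void *fresh = sbrk((intptr_t)grow);
        if (fresh == (void *)-1) {
            return NULL; /* the OS refused to grow the heap */
        }
        pool_cur = fresh; /* toy simplification: drop any old leftover */
        pool_left = grow;
        os_requests++;
    }
    void *block = pool_cur;
    pool_cur += size;
    pool_left -= size;
    return block;
}

size_t toy_pool_left(void) { return pool_left; }
unsigned toy_os_requests(void) { return os_requests; }
```

For example, a first toy_alloc(1024) triggers one sbrk() call for 4096 bytes and leaves 3072 bytes in the pool, so a second toy_alloc(1024) is served from the pool without another call to the OS.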

Most of the time the layout will be Text, Data, Heap (grows up), unallocated space and finally Stack (grows down). They all share the same address space.
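One way to see this ordering is to compare the address of a function (text), a global variable (data), a malloc'd block (heap) and a local variable (stack). The sketch below assumes a typical Linux process; the C standard guarantees none of this, and other systems or allocators may lay things out differently. The names (layout_sample, sample_layout, layout_is_conventional, sample_global) are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* One sample address from each region, stored as integers for comparison. */
struct layout_sample {
    uintptr_t text;
    uintptr_t data;
    uintptr_t heap;
    uintptr_t stack;
};

int sample_global = 1; /* initialized global: lives in the data segment */

/* Fill *out with one address per region. Returns 0 on success, -1 if malloc fails. */
int sample_layout(struct layout_sample *out) {
    int local = 0;            /* local variable: lives on the stack */
    void *block = malloc(16); /* dynamic allocation: lives on the heap */
    if (block == NULL) {
        return -1;
    }
    out->text = (uintptr_t)&sample_layout; /* function address: text segment */
    out->data = (uintptr_t)&sample_global;
    out->heap = (uintptr_t)block;
    out->stack = (uintptr_t)&local;
    free(block);
    return 0;
}

/* Nonzero if the samples follow the Text < Data < Heap < Stack ordering. */
int layout_is_conventional(const struct layout_sample *s) {
    return s->text < s->data && s->data < s->heap && s->heap < s->stack;
}
```

On a typical Linux build, layout_is_conventional() would be expected to return nonzero. If it does not, that says something about the platform's layout rather than indicating a bug in the program.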

Sumit Gemini answered Dec 11 '22 10:12