
Concurrent Programming, Stacks and Heaps in C/C++

I am sorry if this feels like a repetition of old questions. I have gone through several questions on Stack Overflow and the book Modern Operating Systems by Tanenbaum, and I have still not cleared my doubts regarding this.

First off, I would appreciate any book/resource that I should go through in more detail to better understand this structure. I fail to understand if these are concepts generally explained in OS books, or Programming Languages or Architecture books.

Before I ask my questions, I will list my findings from reading about stacks and heaps:

Heap

  • Contains all Instance Variables, Dynamically Allocated (new/malloc), and Global Variables only
  • Does not use the data structure heap anymore, uses more complex structures
  • Access through memory locations, individual process responsible for memory allocated on it
  • Defragmentation and allocation of memory are done by the OS (whether yes or no, please answer my question on who manages the heap: the OS or the runtime environment)
  • Shared among all threads within the process which have access to its reference

Stack

  • Contains all local variables only. (Pushed on at function call)
  • Uses an actual Stack Data Structure for operation
  • Faster to access due to contiguous nature

Now, for my questions regarding the same:

  1. Global Variables, where do they get allocated? (My belief is that they get allocated on the heap, If so, When do they get allocated, at runtime or compile time, and one further question, can this memory be cleared (as in using delete)? )
  2. What is the structure of the heap? How is the heap organized (is it managed by the os or the run time environment (as set up by the C/C++ compiler) ).
  3. Does the stack hold ONLY method calls and their local variables?
  4. Each application (Process) is given a separate heap, but if you exceed heap allocations, then does it mean that the os was not able to allocate more memory? (I am assuming lack of memory causes the OS to reallocate to avoid fragmentation)
  5. The Heap is accessible from all threads within the process (I believe this to be true). If yes all threads can access Instance Variables, Dynamically allocated variables, global variables (If they have a reference to it)
  6. Different processes, cannot access each others heap (even if they are passed the address)
  7. A Stack overflow crashes
    • only the current thread
    • current process
    • all processes
  8. In C/C++, does memory get allocated at run time on the stack for block variables within a function? (For example, if a sub-block of code (e.g. a for loop) creates a new variable, is that allocated at run time on the stack (or the heap), or is it preallocated?) When are they removed, and how is block-level scope maintained? My belief is that all additions to the stack are made at run time before the start of a block, and whenever the end of that block is reached, all elements added up to that point are popped.
  9. The CPU's support for the stack register is limited to a stack pointer that can be incremented (pop) and decremented (push) via normal access to memory. (Is this true?)
  10. Last, are both stack, and heap structures generated by the OS/Runtime environment that exist on Main Memory (As an abstraction?)

I know this is a lot, and I appear to be very confused throughout, I would appreciate it if you could point me in the right direction to get these things cleared up!

Nicomoto asked Jan 24 '13 03:01


1 Answer

  1. Global variables are allocated in a static section of memory that's laid out at compile time. Their values are initialized during startup, before main is entered. The initialization may, of course, allocate on the heap (e.g. a statically allocated std::string has the structure itself in the statically laid-out memory, but the string data it contains is allocated on the heap during startup). These objects are destroyed during normal program shutdown, and you can't free them before then. If you need to, wrap the value in a pointer and initialize the pointer on program startup.

  2. The heap is managed by an allocator library. One comes with the C runtime, but there are also custom ones like tcmalloc or jemalloc that you can use in place of the standard allocator. These allocators request memory from the OS in large, multi-page chunks using system calls, and then hand you portions of those chunks when you call malloc. The organization of the heap is somewhat complex and varies between allocators; you can look up how they work on their websites.

  3. Yes-ish, though you can use library functions like alloca to carve out a chunk of space on the stack and use it for whatever you want.

  4. Each process has a separate memory space; that is, it thinks it is all alone and no other process exists. Generally the OS will give you more memory if you ask for it, but it can also enforce limits (like ulimit on Linux), at which point it can refuse to give you more. Fragmentation isn't an issue for the OS because it hands out memory in pages. However, fragmentation inside your process may cause your allocator to ask for more pages even when it still has free space.

  5. Yes.

  6. Yes. However, there are generally OS-specific ways to create shared-memory regions that multiple processes can access.

  7. A stack overflow doesn't crash anything by itself. It causes values to be written to memory that may hold other data, corrupting it, and acting on corrupted memory causes crashes. When your process accesses unmapped memory (see the overview below), it crashes: not just the thread, but the whole process. Other processes are not affected, since their memory spaces are isolated. (This was not true on old operating systems like Windows 95, where all processes shared the same memory space.)

  8. In C++, stack-allocated objects are created when the block is entered and destroyed when the block is exited. The actual space on the stack may be reserved less precisely (for example, a compiler may reserve room for all of a function's locals at once on entry), but construction and destruction take place at exactly those points.

  9. The stack pointer on x86 processors can be manipulated arbitrarily. It's common for compilers to generate code that simply adjusts the stack pointer by the amount of space needed (subtracting, since the x86 stack grows downward) and then writes the values into that space, instead of doing a series of push operations.

  10. The stacks and heap of the process all live in the same memory space.
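
Points 1 and 8 can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not a description of what any particular compiler emits: Tracer, event_log, global_tracer, and run_block_demo are hypothetical names invented for this example. The global is constructed during startup, before main is entered, and block-scoped objects are constructed and destroyed exactly at block entry and exit, whatever the compiler does with the underlying stack space.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log for this illustration. A function-local static
// sidesteps initialization-order problems between globals.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical helper type that records its own construction and destruction.
struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name(std::move(n)) {
        event_log().push_back("construct " + name);
    }
    ~Tracer() { event_log().push_back("destroy " + name); }
};

// Static storage: the Tracer object itself sits in statically laid-out memory.
// Its std::string member may still allocate character data on the heap
// while the program starts up.
Tracer global_tracer("global");

// Creates objects in nested blocks and returns the events recorded so far
// by this call (the destruction of `outer` happens after the return value
// has been built, so it is not part of the returned list).
std::vector<std::string> run_block_demo() {
    // The global was constructed during startup, before main was entered.
    assert(event_log().front() == "construct global");

    const auto start = static_cast<std::ptrdiff_t>(event_log().size());
    Tracer outer("outer");
    for (int i = 0; i < 2; ++i) {
        Tracer inner("inner" + std::to_string(i));  // constructed on every iteration
    }                                               // destroyed at the end of every iteration
    event_log().push_back("after loop");
    return std::vector<std::string>(event_log().begin() + start, event_log().end());
}
```

Calling run_block_demo() from main should yield the construct/destroy pairs for inner0 and inner1 before "after loop", and the log should end with "destroy outer" once the function has returned.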

An overview of how memory is organized may be helpful:

  • You have physical memory which the kernel sees.
  • The kernel maps pages of physical memory to pages of virtual memory when a process asks for it.
  • A process operates in its own virtual memory space, ignorant of other processes on the system.
  • When a process starts, it puts sections of the executable (code, globals, etc) into some of these virtual memory pages.
  • The allocator requests pages from the OS in order to satisfy malloc calls; this memory constitutes the heap.
  • When a thread starts (including the initial thread of the process), it asks the OS for a few pages that form its stack. (You can also ask your heap allocator, and use the space it gives you as a stack.)
  • While the program is running, it can freely access all memory in its address space: heap, stack, whatever.
  • When you attempt to access a region of your memory space that is not mapped, your program crashes. (More specifically, you get a signal from the OS, which you can choose to handle.)
  • Stack overflows tend to cause your programs to access such unmapped regions, which is why stack overflows tend to crash your program.
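
To make the thread-related points concrete, here is a minimal sketch, assuming a C++11 or later toolchain with std::thread available (some platforms need a flag such as -pthread). ThreadReport and run_threads are hypothetical names made up for this example. Every thread writes into one shared heap allocation, while each thread's local variable lives on that thread's own stack. The threads wait for each other before exiting, so all the locals are alive at the same time and must have distinct addresses.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical record of what one thread observed.
struct ThreadReport {
    int value_written = 0;             // value the thread stored in the shared heap block
    std::uintptr_t local_address = 0;  // address of a local variable on that thread's stack
};

std::vector<ThreadReport> run_threads(std::size_t thread_count) {
    assert(thread_count > 0);

    // One heap allocation visible to every thread. Each thread writes only
    // to its own slot, so no lock is needed for this particular pattern.
    auto shared = std::make_unique<std::vector<ThreadReport>>(thread_count);
    std::atomic<std::size_t> arrived{0};

    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        ThreadReport* slot = &(*shared)[i];
        workers.emplace_back([slot, i, thread_count, &arrived] {
            int local = static_cast<int>(i) * 10;  // lives on this thread's stack
            slot->value_written = local;
            slot->local_address = reinterpret_cast<std::uintptr_t>(&local);

            // Keep `local` alive until every thread has recorded its address.
            arrived.fetch_add(1);
            while (arrived.load() < thread_count) std::this_thread::yield();
        });
    }
    for (auto& w : workers) w.join();
    return *shared;  // copy the results out of the shared heap block
}
```

After run_threads(3) returns, the caller sees the values 0, 10, and 20 that the workers wrote through the shared heap pointer, along with three different stack addresses, one per thread.
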
yiding answered Oct 14 '22 13:10