I am trying to understand the actual process behind object creation in Java - and, I suppose, in other programming languages.
Would it be wrong to assume that object initialization in Java is the same as when you use malloc for a structure in C?
Example (Java):
Foo f = new Foo(10);
And the rough C analogue:
typedef struct foo Foo;
Foo *f = malloc(sizeof(Foo));
Is this why objects are said to be on the heap rather than the stack? Because they are essentially just pointers to data?
In C, malloc() allocates a region of memory in the heap and returns a pointer to it. That's all you get. The memory is uninitialized, and you have no guarantee that it's all zeros or anything else.
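To make that concrete, here's a minimal C sketch (the sizes and variable names are just for illustration) showing that malloc() hands you raw bytes and any initialization is your job, whereas calloc() at least zero-fills:

#include <stdlib.h>
#include <string.h>

int main(void) {
    /* malloc(): 10 ints' worth of raw bytes, contents indeterminate */
    int *raw = malloc(10 * sizeof(int));

    /* calloc(): same amount of space, but guaranteed to be zero-filled */
    int *zeroed = calloc(10, sizeof(int));

    if (raw != NULL) {
        memset(raw, 0, 10 * sizeof(int)); /* you must initialize it yourself */
    }

    free(raw);     /* free(NULL) is harmless, so no extra checks needed here */
    free(zeroed);
    return 0;
}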
In Java, calling new does a heap-based allocation just like malloc(), but you also get a ton of additional convenience (or overhead, if you prefer). For example, you don't have to explicitly specify the number of bytes to be allocated; the compiler figures it out for you based on the type of object you're trying to allocate. Additionally, the object's constructor is called (and you can pass arguments to it if you'd like to control how initialization occurs). When new returns, you're guaranteed to have an object that's initialized.
But yes, at the end of the call, both the result of malloc() and the result of new are simply pointers to some chunk of heap-based data.
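To tie the two together, here's a rough C sketch of what new Foo(10) corresponds to conceptually. Foo_init() is a made-up "constructor" function for illustration only, not something the compiler or the JVM actually generates:

#include <stdlib.h>

typedef struct foo {
    int value;
} Foo;

/* plays the role of the Foo(int) constructor */
static void Foo_init(Foo *f, int value) {
    f->value = value;
}

int main(void) {
    /* roughly what "Foo f = new Foo(10);" does: allocate, then initialize */
    Foo *f = malloc(sizeof(Foo));
    if (f != NULL) {
        Foo_init(f, 10);
    }

    free(f); /* in Java, the garbage collector handles this part for you */
    return 0;
}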
The second part of your question asks about the differences between a stack and a heap. Far more comprehensive answers can be found by taking a course on (or reading a book about) compiler design; a course on operating systems would also be helpful. There are also numerous questions and answers on SO about stacks and heaps.
Having said that, I'll give a general overview that I hope isn't too verbose and that explains the differences at a fairly high level.
Fundamentally, the main reason to have two memory management systems, i.e. a heap and a stack, is for efficiency. A secondary reason is that each is better at certain types of problems than the other.
Stacks are somewhat easier to understand as a concept, so I'll start with them. Consider this function in C:
int add(int lhs, int rhs) {
    int result = lhs + rhs;
    return result;
}
The above seems fairly straightforward. We define a function named add() and pass in the left and right addends; the function adds them and returns the result. Please ignore edge cases such as overflow - they aren't germane to this discussion.
The add() function's purpose seems pretty straightforward, but what can we tell about its lifecycle, especially its memory utilization needs?
Most importantly, the compiler knows a priori (i.e., at compile time) how large the data types are and how many will be used. The lhs and rhs arguments are each sizeof(int), 4 bytes on a typical platform. The variable result is also sizeof(int). The compiler can tell that the add() function therefore uses 4 bytes * 3 ints, or a total of 12 bytes of memory.
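You can check that arithmetic yourself; the exact numbers assume a platform where sizeof(int) is 4:

#include <stdio.h>

int main(void) {
    /* on a typical platform this prints 4 and 12 */
    printf("sizeof(int) = %zu\n", sizeof(int));
    printf("stack space for lhs, rhs, result = %zu bytes\n", 3 * sizeof(int));
    return 0;
}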
When the add() function is called, a hardware register called the stack pointer will contain an address pointing to the top of the stack. In order to allocate the memory the add() function needs to run, all the function-entry code has to do is issue one single assembly-language instruction to decrement the stack pointer register by 12. In doing so, it creates storage on the stack for three ints, one each for lhs, rhs, and result. Getting the memory space you need by executing a single instruction is a massive win in terms of speed, because single instructions tend to execute in one clock tick (one billionth of a second on a 1 GHz CPU).
Also, from the compiler's view, it can create a map of those variables that looks an awful lot like indexing an array:
lhs: ((int *)stack_pointer_register)[0]
rhs: ((int *)stack_pointer_register)[1]
result: ((int *)stack_pointer_register)[2]
Again, all of this is very fast.
When the add() function exits, it has to clean up. It does this by adding the 12 bytes back to the stack pointer register. It's similar to a call to free(), but it only uses one CPU instruction and it only takes one tick. It's very, very fast.
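If you want to see the stack in action, a quick experiment is to print the addresses of locals at different call depths. This is platform-dependent - the direction of growth and the exact offsets aren't guaranteed by the C standard - but on most mainstream platforms you'll see the deeper frame at a lower address:

#include <stdio.h>

static void inner(void) {
    int local = 0;
    /* typically a lower address than outer()'s local, because the stack
       grows downward as frames are pushed */
    printf("inner's local lives at %p\n", (void *)&local);
}

static void outer(void) {
    int local = 0;
    printf("outer's local lives at %p\n", (void *)&local);
    inner();
}

int main(void) {
    outer();
    return 0;
}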
Now consider a heap-based allocation. This comes into play when we don't know a priori how much memory we're going to need (i.e. we'll only learn about it at runtime).
Consider this function:
#include <stdlib.h>   /* for malloc(), free(), and (on POSIX) random() */

int addRandom(int count) {
    int numberOfBytesToAllocate = sizeof(int) * count;
    int *array = malloc(numberOfBytesToAllocate);
    int result = 0;

    if (array != NULL) {
        for (int i = 0; i < count; ++i) {
            array[i] = (int) random();
            result += array[i];
        }
        free(array);
    }

    return result;
}
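A quick usage sketch (random() and srandom() are POSIX functions, so this assumes a POSIX system and that addRandom() above is compiled alongside it):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* addRandom() as defined above */
int addRandom(int count);

int main(void) {
    srandom((unsigned)time(NULL));   /* seed the generator */
    int sum = addRandom(1000);       /* heap-allocates space for 1000 ints */
    printf("sum of 1000 random ints: %d\n", sum);
    return 0;
}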
Notice that the addRandom() function doesn't know at compile time what the value of the count argument will be. Because of this, it doesn't make sense to try to define array the way we would if we were putting it on the stack, like this:
int array[count];
If count is huge, it could cause our stack to grow too large and overwrite other program segments. When that stack overflow happens, your program crashes (or worse).
So, in cases where we don't know how much memory we'll need until runtime, we use malloc(). Then we can simply ask for the number of bytes we need when we need them, and malloc() will check whether it can vend that many bytes. If it can, great, we get them back; if not, we get a NULL pointer, which tells us the call to malloc() failed. Notably, though, the program doesn't crash! Of course you as the programmer can decide that your program isn't allowed to run if resource allocation fails, but programmer-initiated termination is different from a spurious crash.
So now we have to come back and look at efficiency. The stack allocator is super fast - one instruction to allocate, one instruction to deallocate - and the compiler does that work for you. But remember, the stack is meant for things like local variables of known size, so it tends to be fairly small.
The heap allocator, on the other hand, is several orders of magnitude slower. It has to do a lookup in its tables to see whether it has enough free memory to vend the amount the caller wants. It has to update those tables after it vends the memory to make sure no one else can use that block (this bookkeeping may require the allocator to reserve memory for itself in addition to what it plans to vend). The allocator has to employ locking strategies to make sure it vends memory in a thread-safe way. And when memory is finally free()d - which typically happens at different times and in no predictable order - the allocator has to find contiguous free blocks and stitch them back together to repair heap fragmentation. If that sounds like it's going to take more than a single CPU instruction to accomplish, you're right! It's very complicated and it takes a while.
But heaps are big. Much larger than stacks. We can get lots of memory from them and they're great when we don't know at compile time how much memory we'll need. So we trade off speed for a managed memory system that declines us politely instead of crashing when we try to allocate something too large.
I hope that helps answer some of your questions. Please let me know if you'd like clarification on any of the above.