Take a simple example:
int a = 5;
I know 5 gets stored into a memory block.
My area of interest is where does the variable 'a' get stored?
Related sub-questions: Where does 'a' get associated with the memory block that contains the primitive value 5? Is another memory block created to hold 'a'? That would make it seem as though 'a' is a pointer to an object, but it's a primitive type involved here.
Primitive Data Types. The eight primitives defined in Java are int, byte, short, long, float, double, boolean and char. These aren't considered objects and represent raw values. When declared as local variables, they're stored directly on the stack.
Thus, you have seen that local variables of primitive type are stored on the stack, and in the case of a reference type, the stack holds a pointer to the object on the heap.
Stack memory stores local primitive values and the addresses of objects. The objects themselves are stored in heap memory.
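As a small illustration of the difference between a local that holds a value and a local that holds a reference, here is a minimal sketch. The class and method names are made up for this example.

```java
final class ValueVsReference {
    /** Copies a primitive, changes the copy, and returns the original. */
    static int copyPrimitive() {
        int a = 5;      // the local slot for 'a' holds the value 5 itself
        int b = a;      // 'b' gets its own slot with its own copy of 5
        b = 6;          // only b's slot changes
        return a;       // a is untouched
    }

    /** Copies a reference, writes through the copy, and returns what the original sees. */
    static int copyReference() {
        int[] x = {5};  // the array lives on the heap; 'x' holds a reference to it
        int[] y = x;    // 'y' is a second reference to the same array
        y[0] = 6;       // writes into the one shared heap object
        return x[0];    // x sees the write because it refers to the same array
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive());  // prints 5
        System.out.println(copyReference());  // prints 6
    }
}
```

Copying a primitive copies the value; copying a reference only copies the address, so both locals end up looking at the same heap object.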
However, Java provides support for character strings through the String class in the java.lang package. The String class has some special support from the Java programming language, so it can feel like a primitive, but technically it is not a primitive data type. Enclosing a character string in double quotes automatically creates a new String object.
To expound on "Do Java primitives go on the Stack or the Heap?":
Let's say you have a function foo():
    void foo() {
        int a = 5;
        System.out.println(a);
    }
Then when the compiler compiles that function, it'll create bytecode instructions that leave 4 bytes of room on the stack whenever that function is called. The name 'a' is only useful to you - the compiler just creates a spot for it, remembers where that spot is, and everywhere it wants to use the value of 'a' it instead inserts references to the memory location it reserved for that value.
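To make the "name becomes a spot" idea concrete, here is a minimal sketch. The bytecode in the comment is what javac typically emits for a static method like this (you can check with javap -c); the class and method names are made up for the example.

```java
final class LocalSlotDemo {
    // For this static method javac typically emits something like:
    //   iconst_5   // push the constant 5
    //   istore_0   // store it in local variable slot 0 (the spot that was 'a')
    //   iload_0    // load slot 0 wherever the source said 'a'
    //   ireturn
    // The instructions refer to a slot number, not to the name 'a'.
    static int foo() {
        int a = 5;
        return a;
    }

    /** Each invocation gets its own frame, so nested calls never clobber 'a'. */
    static int nested(int depth) {
        int a = depth;
        if (depth > 0) {
            nested(depth - 1);  // the callee's 'a' lives in a different frame
        }
        return a;               // still equal to this call's depth
    }

    public static void main(String[] args) {
        System.out.println(foo());      // prints 5
        System.out.println(nested(3));  // prints 3
    }
}
```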
If you're not sure how the stack works, it works like this: every program has at least one thread, and every thread has exactly one stack. The stack is a contiguous block of memory (that can also grow if needed). Initially the stack is empty, until the first function in your program is called. Then, when your function is called, your function allocates room on the stack for itself, for all of its local variables, for its return value etc.
When your function main calls another function foo, here's one example of what could happen (there are a couple of simplifying white lies here):
1. main wants to pass parameters to foo. It pushes those values onto the top of the stack in such a way that foo will know exactly where they will be put (main and foo will pass parameters in a consistent way).
2. main pushes the address of where program execution should return to after foo is done. This increments the stack pointer.
3. main calls foo.
4. When foo starts, it sees that the stack is currently at address X.
5. foo wants to allocate 3 int variables on the stack, so it needs 12 bytes.
6. foo will use X + 0 for the first int, X + 4 for the second int, X + 8 for the third.
7. Everything main pushed on the stack before calling foo can also be accessed by foo by computing some offset from the stack pointer.
8. foo knows how many parameters it takes (say 3) so it knows that, say, X - 8 is the first one, X - 12 is the second one, and X - 16 is the third one.
9. Now that foo has room on the stack to do its work, it does so and finishes.
10. Before main called foo, main wrote its return address on the stack before incrementing the stack pointer.
11. foo looks up the address to return to - say that address is stored at ESP - 4 - foo looks at that spot on the stack, finds the return address there, and jumps to the return address.
12. main continues to run and we've made a full round trip.
Note that each time a function is called, it can do whatever it wants with the memory pointed to by the current stack pointer and everything after it. Each time a function makes room on the stack for itself, it increments the stack pointer before calling other functions to make sure that everybody knows where they can use the stack for themselves.
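The walk-through above can be simulated with an int array standing in for stack memory. This is only a toy model under the same simplifications as the text (a stack that grows upward, 4-byte ints, parameters at X - 8, X - 12 and X - 16, return address at X - 4). ToyStack and everything in it are made-up names, not anything a real JVM or CPU exposes.

```java
final class ToyStack {
    static final int WORD = 4;           // bytes per int
    final int[] memory = new int[64];    // 64 words of pretend stack memory
    int sp = 0;                          // stack pointer, as a byte address
    int lastReturnAddress;               // what foo found at X - 4

    void push(int value) {
        memory[sp / WORD] = value;
        sp += WORD;                      // this toy stack grows upward
    }

    int load(int byteAddress) { return memory[byteAddress / WORD]; }

    void store(int byteAddress, int value) { memory[byteAddress / WORD] = value; }

    /** Plays the role of main calling foo(a, b, c). */
    int callFoo(int a, int b, int c, int returnAddress) {
        int base = sp;
        push(c);                         // will be at X - 16 from foo's point of view
        push(b);                         // X - 12
        push(a);                         // X - 8
        push(returnAddress);             // X - 4
        int result = foo();
        sp = base;                       // main discards the parameters and return address
        return result;
    }

    /** Plays the role of foo: copies its parameters into three locals and sums them. */
    int foo() {
        int x = sp;                      // "the stack is currently at address X"
        sp += 3 * WORD;                  // reserve 12 bytes for three int locals
        store(x + 0, load(x - 8));       // first local  = first parameter
        store(x + 4, load(x - 12));      // second local = second parameter
        store(x + 8, load(x - 16));      // third local  = third parameter
        int result = load(x + 0) + load(x + 4) + load(x + 8);
        lastReturnAddress = load(x - 4); // the address foo would jump back to
        sp = x;                          // release the locals
        return result;
    }

    public static void main(String[] args) {
        ToyStack stack = new ToyStack();
        System.out.println(stack.callFoo(1, 2, 3, 1234)); // prints 6
        System.out.println(stack.lastReturnAddress);      // prints 1234
    }
}
```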
I know this explanation blurs the line between x86 and Java a little bit, but I hope it helps to illustrate how the hardware actually works.
Now, this only covers 'the stack'. The stack exists for each thread in the program and captures the state of the chain of function calls between each function running on that thread. However, a program can have several threads, and so each thread has its own independent stack.
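Here is a minimal sketch of two threads running the same recursive method, each building up its own chain of frames on its own stack. The names are made up for the example; the only shared memory is the small results array, which lives on the heap.

```java
final class PerThreadStacks {
    /** Recurses 'depth' times; every frame and every 'local' belongs to the calling thread. */
    static int countFrames(int depth) {
        int local = 1;
        return depth == 0 ? local : local + countFrames(depth - 1);
    }

    /** Runs countFrames on two threads and returns both results. */
    static int[] runOnTwoThreads() {
        int[] results = new int[2];  // heap array so both threads can report back
        Thread t1 = new Thread(() -> results[0] = countFrames(10));
        Thread t2 = new Thread(() -> results[1] = countFrames(20));
        t1.start();
        t2.start();
        try {
            t1.join();               // join also makes the threads' writes visible here
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        int[] results = runOnTwoThreads();
        System.out.println(results[0] + " " + results[1]); // prints 11 21
    }
}
```

Neither thread's recursion disturbs the other's, because the frames are never shared.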
What happens when two function calls want to deal with the same piece of memory, regardless of what thread they're on or where they are in the stack?
This is where the heap comes in. Typically (but not always) one program has exactly one heap. The heap is called a heap because, well, it's just a big ol heap of memory.
To use memory in the heap, you have to call allocation routines - routines that find unused space and give it to you, and routines that let you return space you allocated but are no longer using. The memory allocator gets big pages of memory from the operating system, and then hands out individual little bits to whatever needs it. It keeps track of what the OS has given to it, and out of that, what it has given out to the rest of the program. When the program asks for heap memory, it looks for the smallest chunk of memory that it has available that fits the need, marks that chunk as being allocated, and hands it back to the rest of the program. If it doesn't have any more free chunks, it could ask the operating system for more pages of memory and allocate out of there (up until some limit).
In languages like C, those memory allocation routines I mentioned are usually called malloc() to ask for memory and free() to return it.
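To show the bookkeeping described above, here is a toy allocator sketch that hands out addresses from a pretend block of memory using the "smallest chunk that fits" rule. ToyHeap, allocate and release are made-up names. Real allocators are far more elaborate (this one never merges neighbouring free chunks and never asks the OS for more memory).

```java
import java.util.ArrayList;
import java.util.List;

final class ToyHeap {
    /** A contiguous run of bytes, either free or handed out. */
    private static final class Chunk {
        final int address;
        final int size;

        Chunk(int address, int size) {
            this.address = address;
            this.size = size;
        }
    }

    private final List<Chunk> free = new ArrayList<>();
    private final List<Chunk> used = new ArrayList<>();

    ToyHeap(int totalBytes) {
        free.add(new Chunk(0, totalBytes));  // pretend the OS gave us one big page
    }

    /** Plays the role of malloc: returns the address of a block, or -1 if nothing fits. */
    int allocate(int size) {
        Chunk best = null;
        for (Chunk c : free) {
            if (c.size >= size && (best == null || c.size < best.size)) {
                best = c;                    // smallest free chunk that still fits
            }
        }
        if (best == null) {
            return -1;                       // a real allocator might ask the OS for more
        }
        free.remove(best);
        used.add(new Chunk(best.address, size));
        if (best.size > size) {
            free.add(new Chunk(best.address + size, best.size - size)); // keep the leftover
        }
        return best.address;
    }

    /** Plays the role of free: returns a previously allocated block to the free list. */
    void release(int address) {
        for (int i = 0; i < used.size(); i++) {
            if (used.get(i).address == address) {
                free.add(used.remove(i));
                return;
            }
        }
        throw new IllegalArgumentException("not an allocated address: " + address);
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap(100);
        int a = heap.allocate(40);
        int b = heap.allocate(40);
        System.out.println(a + " " + b);        // prints 0 40
        System.out.println(heap.allocate(40));  // prints -1, only 20 bytes are left
        heap.release(a);
        System.out.println(heap.allocate(10));  // prints 80, the 20-byte chunk is the best fit
    }
}
```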
Java, on the other hand, doesn't have explicit memory management like C does. Instead it has a garbage collector - you allocate whatever memory you want, and then when you're done, you just stop using it. The Java runtime environment keeps track of what memory you've allocated, scans your program to find allocations that you can no longer reach, and automatically deallocates those chunks.
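The "scan your program" step can be sketched as a reachability walk: start from the references the program still holds (the roots), follow every reference, and treat whatever was never visited as garbage. This is a toy mark-and-sweep over a hand-built object graph, with made-up names. Real JVM collectors are far more sophisticated.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyCollector {
    /** A pretend heap object that may refer to other objects. */
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Mark phase: everything reachable from the roots is still in use. */
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {             // first visit: follow its references too
                pending.addAll(n.references);
            }
        }
        return marked;
    }

    /** Sweep phase: anything allocated but not marked can be reclaimed. */
    static List<String> garbage(List<Node> allAllocated, List<Node> roots) {
        Set<Node> live = reachable(roots);
        List<String> dead = new ArrayList<>();
        for (Node n : allAllocated) {
            if (!live.contains(n)) {
                dead.add(n.name);
            }
        }
        return dead;
    }

    /** Builds a -> b and d -> c, keeps only a as a root, and reports the garbage. */
    static String demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.references.add(b);
        d.references.add(c);                 // d refers to c, but nothing refers to d
        List<Node> all = new ArrayList<>();
        all.add(a); all.add(b); all.add(c); all.add(d);
        List<Node> roots = new ArrayList<>();
        roots.add(a);
        return garbage(all, roots).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());          // prints [c, d]
    }
}
```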
So now that we know that memory is allocated on the heap or the stack, what happens when I create a private variable in a class?
    public class Test {
        private int balance;
        ...
    }
Where does that memory come from? The answer is the heap. You have some code that creates a new Test object - Test myTest = new Test(). Calling the Java new operator causes a new instance of Test to be allocated on the heap. Your variable myTest stores the address of that allocation. balance is then just some fixed offset from that address - a small one, though usually not 0, since JVMs generally keep a header at the start of each object.
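A minimal sketch of what that means in practice: every object created with new carries its own copy of the field, and two locals that hold the same address see the same field. TestAccount and its methods are hypothetical stand-ins for the Test class above.

```java
final class TestAccount {
    private int balance;                 // stored inside each TestAccount object on the heap

    void deposit(int amount) {
        balance += amount;
    }

    int balance() {
        return balance;
    }

    static String demo() {
        TestAccount first = new TestAccount();   // 'first' is a local holding a reference
        TestAccount second = new TestAccount();  // a separate object with its own balance
        TestAccount alias = first;               // same address, so same object as 'first'
        first.deposit(5);
        return first.balance() + " " + second.balance() + " " + alias.balance();
    }

    public static void main(String[] args) {
        System.out.println(demo());      // prints 5 0 5
    }
}
```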
The answer at the very bottom is all just .. accounting.
...
The white lies I spoke about? Let's address a few of those.
Java is first a computer model - when you compile your program to bytecode, you're compiling to a completely made-up computer architecture that doesn't have the registers or assembly instructions of any common CPU. Java, .NET, and a few others use a stack-based virtual machine instead of a register-based machine (like x86 processors). The reason is that stack-based machines are easier to reason about, so it's easier to build tools that manipulate that code, which is especially important for tools that compile that code to machine code that will actually run on common processors.
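To give a feel for what "stack-based" means, here is a tiny interpreter sketch in which instructions take their operands from an operand stack and push results back instead of naming registers. The opcodes and the flat int-array program format are made up for this example and are not real JVM bytecode.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TinyStackMachine {
    // Hypothetical opcodes for this sketch only.
    static final int PUSH = 0, ADD = 1, MUL = 2;

    /** Runs a program given as a flat int array; PUSH is followed by its operand. */
    static int run(int... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int pc = 0; pc < program.length; pc++) {
            switch (program[pc]) {
                case PUSH:
                    stack.push(program[++pc]);              // operand follows the opcode
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());  // pop two values, push the sum
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());  // pop two values, push the product
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode " + program[pc]);
            }
        }
        return stack.pop();                                  // the result is left on top
    }

    /** Evaluates (5 + 3) * 2. */
    static int demo() {
        return run(PUSH, 5, PUSH, 3, ADD, PUSH, 2, MUL);
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints 16
    }
}
```

No instruction ever says where its inputs live; they are always the top of the stack, which is a large part of why such code is easy for tools to analyse.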
The stack pointer for a given thread typically starts at some very high address and then grows down, instead of up, at least on most x86 computers. That said, since that's a machine detail, it's not actually Java's problem to worry about (Java has its own made-up machine model to worry about; it's the Just In Time compiler's job to worry about translating that to your actual CPU).
I mentioned briefly how parameters are passed between functions, saying stuff like "parameter A is stored at ESP - 8, parameter B is stored at ESP - 12" etc. This is generally called the "calling convention", and there are more than a few of them. On x86-32, registers are scarce, and so many calling conventions pass all parameters on the stack. This has some tradeoffs, particularly that accessing those parameters might mean a trip to RAM (though cache might mitigate that). x86-64 has a lot more named registers, which means that the most common calling conventions pass the first few parameters in registers, which presumably improves speed. Additionally, since the Java JIT is the only guy that generates machine code for the entire process (excepting native calls), it can choose to pass parameters using any convention it wants.
I mentioned how when you declare a variable in some function, the memory for that variable comes from the stack - that's not always true, and it's really up to the whims of the environment's runtime to decide where to get that memory from. In the case of C#/.NET, the memory for that variable could come from the heap if the variable is used as part of a closure - this is called "heap promotion". Most languages deal with closures by creating hidden classes. So what often happens is that the method's local variables that are involved in closures are rewritten to be members of some hidden class, and when that method is invoked, it instead allocates a new instance of that class on the heap and stores its address on the stack; all references to that originally-local variable then go through that heap reference.
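Here is a hand-written sketch of that rewrite. Java does not do this promotion for you (a lambda may only capture locals that are effectively final), so the "hidden class" is spelled out explicitly as CounterBox. All names are made up for the example.

```java
import java.util.function.IntSupplier;

final class ClosureDemo {
    /** Hand-written stand-in for the hidden class a compiler might generate. */
    static final class CounterBox {
        int count;  // the would-be local variable, now a field of a heap object
    }

    /** Returns a closure that keeps counting after this method has returned. */
    static IntSupplier makeCounter() {
        CounterBox box = new CounterBox();  // heap allocation; only the reference is a local
        return () -> ++box.count;           // the lambda captures the reference, not a stack slot
    }

    static String demo() {
        IntSupplier counter = makeCounter();  // makeCounter's frame is gone after this line
        int first = counter.getAsInt();
        int second = counter.getAsInt();
        return first + "," + second;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints 1,2
    }
}
```

The count survives between calls because it lives in the heap object, which outlives the stack frame of the method that created it.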