
Arrays in different languages - store references, or raw objects?

Tags: java, arrays, c, jvm

I am trying to wrap my head around what the raw memory looks like in different languages when using an array.

Consider the following Java code:

String a = "hi";
String b = "there";
String c = "everyone";
String[] array = {a, b, c};

Obviously the array is holding references, not objects; that is, there is a contiguous array in memory of three references, each of which points to some other location in memory where the object sits. So the objects themselves aren't necessarily sitting in three contiguous buckets; rather, the references are.
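A minimal sketch of how this can be checked from inside Java, assuming nothing beyond the standard library. The class name ReferenceArrayDemo and the helper sameObject are hypothetical names chosen for illustration. It relies on == comparing references rather than contents:

```java
class ReferenceArrayDemo {
    // Returns true when the array slot and 'expected' refer to the very same object.
    static boolean sameObject(Object[] array, int index, Object expected) {
        return array[index] == expected; // == on objects compares references, not contents
    }

    public static void main(String[] args) {
        String a = "hi";
        String b = "there";
        String c = "everyone";
        String[] array = {a, b, c};

        // Slot 0 holds the same reference as the variable a.
        System.out.println(sameObject(array, 0, a));    // true

        // A separate object with equal contents is a different reference.
        String copy = new String("hi");
        System.out.println(sameObject(array, 0, copy)); // false
        System.out.println(array[0].equals(copy));      // true
    }
}
```

This only shows that the slots hold references; it says nothing about where the String objects themselves sit in memory.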

Now consider this:

String[] array = {"hi", "there", "everyone"};

I'd imagine in this situation the Strings exist somewhere with all the other constants in memory, and the array then holds references to those constants? So, again, in raw memory the array doesn't look like ['h', 'i', '\0', 't', 'h', 'e', 'r', 'e', ... (etc)] (using C-style termination just for convenience). Rather, it's more like ['a83a3edf', 'a38decd', ... (etc)], where each element is a memory location (a reference).
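One observable consequence of this can be sketched as follows. Java string literals are interned, so two arrays built from the same literals are different array objects whose slots refer to the same String objects. The class name LiteralArrayDemo and the method greetings are hypothetical; where the string data physically lives is a JVM implementation detail that this sketch does not show:

```java
class LiteralArrayDemo {
    // Builds a fresh array each call, initialised from the same three literals.
    static String[] greetings() {
        return new String[] {"hi", "there", "everyone"};
    }

    public static void main(String[] args) {
        String[] first = greetings();
        String[] second = greetings();

        // Two separate array objects...
        System.out.println(first == second);       // false
        // ...whose slots refer to the same interned String object.
        System.out.println(first[0] == second[0]); // true
    }
}
```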

My conclusion from this thought process is that in Java, you can never ever imagine arrays as buckets of contiguous objects in memory, but rather as contiguous references. I can't think of any way to guarantee objects will always be stored contiguously in Java.

Now consider C:

const char *a = "hi";          /* string literals must not be modified, hence const */
const char *b = "there";
const char *c = "everyone";
const char *array[] = {a, b, c};

The code above is functionally equivalent to the Java above -- that is, the array holds references (pointers) to some other memory location. Like Java, the objects being pointed to aren't necessarily contiguous.

HOWEVER, in the following C code:

struct my_struct array[5];  // allocates 5 * sizeof(struct my_struct) bytes! NOT room for 5
                            // references/pointers, but room for 5 my_structs.

The structs in array ARE contiguously located in raw memory.

Now for my concrete questions:

  1. Was I correct in my assumption that in Java, arrays must ALWAYS hold references, as the programmer only ever has access to references in Java? What about for raw data types? Will it work differently then? Will an array of ints in Java look just like one in C in raw memory (besides the Object class cruft Java will add)?

  2. In Java, is there no way for the programmer to guarantee contiguous memory allocation of objects? It might happen by chance, or with high probability, but the programmer can not GUARANTEE it will be so?

  3. In C, programmers CAN create raw arrays of objects (structs) contiguously in memory, as I have shown above, correct?

  4. How do other languages deal with this? I'm guessing Python works like Java?

The motivation for this question is that I want a solid understanding of what is happening at the raw memory level with arrays in these languages. Mostly for programmer-interview questions. I said in a previous interview that an array (not in any language, just in general) holds objects contiguously in memory like buckets. It was only after I said this that I realized that's not quite how it works in a language like Java. So I want to be 100% clear on it.

Thanks. Let me know if anything needs clarification.

Asked by bob, Sep 03 '15 17:09


1 Answer

you can never ever imagine arrays as buckets of contiguous objects in memory, but rather as contiguous references.

In theory you are right. In practice, the JVM doesn't place objects at random: it typically allocates memory sequentially, and during a GC it copies objects in order of discovery (or in reverse order).

Was I correct in my assumption that in Java, arrays must ALWAYS hold references, as the programmer only ever has access to references in Java?

Yes, unless you have an array of primitives, of course.

What about for raw data types? Will it work differently then?

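Arrays of primitives and arrays of references are both contiguous in memory; the layout is basically the same. The difference is that a primitive array stores the values themselves, while a reference array stores references to objects that live elsewhere.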

Will an array of ints in Java look just like one in C in raw memory (besides the Object class cruft Java will add)?

Yes. After the array's object header, the int values are stored one after another, as in C.
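The difference between the two kinds of array is visible from plain Java, even though the raw layout is not. A minimal sketch, with the hypothetical names PrimitiveVsReferenceArrays and countNullSlots: a new int[] already contains its values (all zero), while a new Integer[] contains only null references until objects are assigned.

```java
class PrimitiveVsReferenceArrays {
    // Counts the slots of a reference array that do not yet refer to any object.
    static int countNullSlots(Object[] array) {
        int count = 0;
        for (Object element : array) {
            if (element == null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] primitives = new int[3];    // three int values stored in the array itself
        Integer[] boxed = new Integer[3]; // three references, none pointing anywhere yet

        System.out.println(primitives[0]);         // 0
        System.out.println(countNullSlots(boxed)); // 3

        boxed[0] = 1000; // autoboxing: slot 0 now refers to an Integer object
        System.out.println(countNullSlots(boxed)); // 2
    }
}
```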

In Java, is there no way for the programmer to guarantee contiguous memory allocation of objects?

Not unless you use off-heap memory. Generally, though, this isn't as much of a problem as you might think, because most of the time the objects will be contiguous in memory.

It might happen by chance, or with high probability, but the programmer can not GUARANTEE it will be so?

Correct. Usually you have bigger problems than this when you look at the worst 0.1% of latencies or beyond.

In C, programmers CAN create raw arrays of objects (structs) contiguously in memory, as I have shown above, correct?

Yes. You can do it in Java as well, but you have to use off-heap memory. A number of libraries support this, such as Javolution, Chronicle and SBE.
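As a rough illustration of the off-heap idea using only the JDK, here is a sketch built on a direct ByteBuffer. It is not how Javolution, Chronicle or SBE work internally; the class OffHeapRecords and its record layout (a 4-byte int id followed by an 8-byte double value) are assumptions made up for this example. Each record occupies a fixed 12-byte slot, so the records sit back to back in one block, much like the C struct array:

```java
import java.nio.ByteBuffer;

class OffHeapRecords {
    // Hypothetical record layout: int id (4 bytes) followed by double value (8 bytes).
    static final int RECORD_SIZE = Integer.BYTES + Double.BYTES;

    private final ByteBuffer buffer;

    OffHeapRecords(int count) {
        // A direct buffer is allocated outside the normal Java heap.
        buffer = ByteBuffer.allocateDirect(count * RECORD_SIZE);
    }

    // Writes record i at byte offset i * RECORD_SIZE.
    void set(int i, int id, double value) {
        int base = i * RECORD_SIZE;
        buffer.putInt(base, id);
        buffer.putDouble(base + Integer.BYTES, value);
    }

    int id(int i) {
        return buffer.getInt(i * RECORD_SIZE);
    }

    double value(int i) {
        return buffer.getDouble(i * RECORD_SIZE + Integer.BYTES);
    }

    int sizeInBytes() {
        return buffer.capacity();
    }

    public static void main(String[] args) {
        OffHeapRecords records = new OffHeapRecords(5);
        records.set(2, 7, 1.5);
        System.out.println(records.id(2));         // 7
        System.out.println(records.value(2));      // 1.5
        System.out.println(records.sizeInBytes()); // 60
    }
}
```

The cost of this approach is that the records are no longer Java objects: every field access goes through explicit offsets, which is the bookkeeping the libraries mentioned above aim to hide.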

Answered by Peter Lawrey, Sep 28 '22 07:09