Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does an array object explicitly contain the indexes?

Tags:

java

arrays

Since day one of learning Java I've been told by various websites and many teachers that arrays are consecutive memory locations which can store the specified number of data all of the same type.

Since an array is an object and object references are stored on the stack, and actual objects live in the heap, object references point to actual objects.

But when I came across examples of how arrays are created in memory, they always show something like this:

(In which a reference to an array object is stored on the stack and that reference points to the actual object in the heap, where there are also explicit indexes pointing to specific memory locations)

Internal working of objects taught by various websites and teachers

But recently I came across online notes of Java in which they stated that arrays' explicit indexes are not specified in the memory. The compiler just knows where to go by looking at the provided array index number during runtime.

Just like this:

Enter image description here

After reading the notes, I also searched on Google regarding this matter, but the contents on this issue were either quite ambiguous or non-existent.

I need more clarification on this matter. Are array object indexes explicitly shown in memory or not? If not, then how does Java manage the commands to go to a particular location in an array during runtime?

like image 262
HQuser Avatar asked Dec 19 '15 07:12

HQuser


People also ask

Do arrays have indexes?

Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

How do you find the index of an object in an array?

To find the index of an object in an array, by a specific property: Use the map() method to iterate over the array, returning only the value of the relevant property. Call the indexOf() method on the returned from map array. The indexOf method returns the index of the first occurrence of a value in an array.

Is an array an indexed list?

An array is an ordered list of values that you refer to with a name and an index. For example, consider an array called emp , which contains employees' names indexed by their numerical employee number.

What is an index within an array?

The index indicates the position of the element within the array (starting from 1) and is either a number or a field containing a number.


2 Answers

In Java, arrays are objects. See the JLS - Chapter 10. Arrays:

In the Java programming language, arrays are objects (§4.3.1), are dynamically created, and may be assigned to variables of type Object (§4.3.2). All methods of class Object may be invoked on an array.

If you look at 10.7. Array Members chapter, you'll see that the index is not part of the array member:

The members of an array type are all of the following:

The public final field length, which contains the number of components of the array. length may be positive or zero.

  • The public method clone, which overrides the method of the same name in class Object and throws no checked exceptions. The return type of the clone method of an array type T[] is T[].

  • All the members inherited from class Object; the only method of Object that is not inherited is its clone method.

Since the size of each type is known, you can easily determine the location of each component of the array, given the first one.

The complexity of accessing an element is O(1) since it only needs to calculate the address offset. It's worth mentioning that this behavior is not assumed for all programming languages.

like image 24
Maroun Avatar answered Oct 11 '22 11:10

Maroun


Does an array object explicitly contain the indexes?

Short answer: No.

Longer answer: Typically not, but it theoretically could do.

Full answer:

Neither the Java Language Specification nor the Java Virtual Machine Specification makes any guarantees about how arrays are implemented internally. All it requires is that array elements are accessed by an int index number having a value from 0 to length-1. How an implementation actually fetches or stores the values of those indexed elements is a detail private to the implementation.

A perfectly conformant JVM could use a hash table to implement arrays. In that case, the elements would be non-consecutive, scattered around memory, and it would need to record the indexes of elements, to know what they are. Or it could send messages to a man on the moon who writes the array values down on labeled pieces of paper and stores them in lots of little filing cabinets. I can't see why a JVM would want to do these things, but it could.

What will happen in practice? A typical JVM will allocate the storage for array elements as a flat, contiguous chunk of memory. Locating a particular element is trivial: multiply the fixed memory size of each element by the index of the wanted element and add that to the memory address of the start of the array: (index * elementSize) + startOfArray. This means that the array storage consists of nothing but raw element values, consecutively, ordered by index. There is no purpose to also storing the index value with each element, because the element's address in memory implies its index, and vice-versa. However, I don't think the diagram you show was trying to say that it explicitly stored the indexes. The diagram is simply labeling the elements on the diagram so you know what they are.

The technique of using contiguous storage and calculating the address of an element by formula is simple and extremely quick. It also has very little memory overhead, assuming programs allocate their arrays only as big as they really need. Programs depend on and expect the particular performance characteristics of arrays, so a JVM that did something weird with array storage would probably perform poorly and be unpopular. So practical JVMs will be constrained to implement contiguous storage, or something that performs similarly.

I can think of only a couple of variations on that scheme that would ever be useful:

  1. Stack-allocated or register-allocated arrays: During optimization, a JVM might determine through escape analysis that an array is only used within one method, and if the array is also a smallish fixed size, it would then be an ideal candidate object for being allocated directly on the stack, calculating the address of elements relative to the stack pointer. If the array is extremely small (fixed size of maybe up to 4 elements), a JVM could go even further and store the elements directly in CPU registers, with all element accesses unrolled & hardcoded.

  2. Packed boolean arrays: The smallest directly addressable unit of memory on a computer is typically an 8-bit byte. That means if a JVM uses a byte for each boolean element, then boolean arrays waste 7 out of every 8 bits. It would use only 1 bit per element if booleans were packed together in memory. This packing isn't done typically because extracting individual bits of bytes is slower, and it needs special consideration to be safe with multithreading. However, packed boolean arrays might make perfect sense in some memory-constrained embedded devices.

Still, neither of those variations requires every element to store its own index.

I want to address a few other details you mentioned:

arrays store the specified number of data all of the same type

Correct.

The fact that all an array's elements are the same type is important because it means all the elements are the same size in memory. That's what allows for elements to be located by simply multiplying by their common size.

This is still technically true if the array element type is a reference type. Although in that case, the value of each element is not the object itself (which could be of varying size) but only an address which refers to an object. Also, in that case, the actual runtime type of objects referred to by each element of the array could be any subclass of the element type. E.g.,

Object[] a = new Object[4]; // array whose element type is Object // element 0 is a reference to a String (which is a subclass of Object) a[0] = "foo";  // element 1 is a reference to a Double (which is a subclass of Object) a[1] = 123.45;  // element 2 is the value null (no object! although null is still assignable to Object type) a[2] = null;  // element 3 is a reference to another array (all arrays classes are subclasses of Object) a[3] = new int[] { 2, 3, 5, 7, 11 }; 

arrays are consecutive memory locations

As discussed above, this doesn't have to be true, although it is almost surely true in practice.

To go further, note that although the JVM might allocate a contiguous chunk of memory from the operating system, that doesn't mean it ends up being contiguous in physical RAM. The OS can give programs a virtual address space that behaves as if contiguous, but with individual pages of memory scattered in various places, including physical RAM, swap files on disk, or regenerated as needed if their contents are known to be currently blank. Even to the extent that pages of the virtual memory space are resident in physical RAM, they could be arranged in physical RAM in an arbitrary order, with complex page tables that define the mapping from virtual to physical addresses. And even if the OS thinks it is dealing with "physical RAM", it still could be running in an emulator. There can be layers upon layers upon layers, is my point, and getting to the bottom of them all to find out what's really going on takes a while!

Part of the purpose of programming language specifications is to separate the apparent behavior from the implementation details. When programming you can often program to the specification alone, free from worrying about how it happens internally. The implementation details become relevant however, when you need to deal with the the real-world constraints of limited speed and memory.

Since an array is an object and object references are stored on the stack, and actual objects live in the heap, object references point to actual objects

This is correct, except what you said about the stack. Object references can be stored on the stack (as local variables), but they can also be stored as static fields or instance fields, or as array elements as seen in the example above.

Also, as I mentioned earlier, clever implementations can sometimes allocate objects directly on the stack or in CPU registers as an optimization, although this has zero effect on your program's apparent behavior, only its performance.

The compiler just knows where to go by looking at the provided array index number during runtime.

In Java, it's not the compiler that does this, but the virtual machine. Arrays are a feature of the JVM itself, so the compiler can translate your source code that uses arrays simply to bytecode that uses arrays. Then it's the JVM's job to decide how to implement arrays, and the compiler neither knows nor cares how they work.

like image 157
Boann Avatar answered Oct 11 '22 11:10

Boann