OpenJDK implementation of System.arraycopy

Tags: java, c++, jvm, openjdk

Following a question about how the JVM creates Strings from a char[], I mentioned that no iteration takes place when the char[] is copied into the new String, because System.arraycopy is eventually called, and it copies the desired memory with a function such as memcpy at a native, implementation-dependent level.

I wanted to check that for myself, so I downloaded the OpenJDK 7 source code and started browsing it. I found the implementation of System.arraycopy in the OpenJDK C++ source code, in openjdk/hotspot/src/share/vm/oops/objArrayKlass.cpp:

if (stype == bound || Klass::cast(stype)->is_subtype_of(bound)) {
  // elements are guaranteed to be subtypes, so no check necessary
  bs->write_ref_array_pre(dst, length);
  Copy::conjoint_oops_atomic(src, dst, length);
} else {
  // slow case: need individual subtype checks

If the elements need no type checks (per the condition above, that is when the source element type is the same as, or a subtype of, the destination element type), Copy::conjoint_oops_atomic gets called.
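The two branches in that excerpt can be pictured with a small stand-alone sketch. Everything in it is hypothetical: the `Klass` and `Obj` structs, `copy_refs`, and the choice to return the number of elements copied are illustrative assumptions, not HotSpot code. It only shows why a single up-front check on the element types can replace one check per element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for class metadata and objects; not HotSpot types.
struct Klass {
  const Klass* super;  // nullptr for the root class

  // True if this class is `other` or inherits from it.
  bool is_subtype_of(const Klass* other) const {
    for (const Klass* k = this; k != nullptr; k = k->super) {
      if (k == other) return true;
    }
    return false;
  }
};

struct Obj {
  const Klass* klass;
};

// Copies n references from src to dst (assumed non-overlapping in this sketch).
// Returns how many elements were copied before a type check failed.
inline std::size_t copy_refs(const Klass* src_elem, const Klass* dst_elem,
                             Obj* const* src, Obj** dst, std::size_t n) {
  assert(src_elem != nullptr && dst_elem != nullptr);
  if (src_elem->is_subtype_of(dst_elem)) {
    // Fast case: every element is guaranteed to fit, so copy in bulk.
    std::copy(src, src + n, dst);
    return n;
  }
  // Slow case: each non-null element needs its own subtype check.
  for (std::size_t i = 0; i < n; ++i) {
    if (src[i] != nullptr && !src[i]->klass->is_subtype_of(dst_elem)) {
      return i;  // stop at the first element that does not fit
    }
    dst[i] = src[i];
  }
  return n;
}
```

With a hypothetical `String` class deriving from `Object`, copying a `String` array into an `Object` array takes the fast case, while copying an `Object` array into a `String` array has to inspect each element.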

The Copy::conjoint_oops_atomic function resides in 'copy.hpp':

// overloaded for UseCompressedOops
static void conjoint_oops_atomic(narrowOop* from, narrowOop* to, size_t count) {
  assert(sizeof(narrowOop) == sizeof(jint), "this cast is wrong");
  assert_params_ok(from, to, LogBytesPerInt);
  pd_conjoint_jints_atomic((jint*)from, (jint*)to, count);
}
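The cast in that overload works because a compressed reference and a jint have the same size, so an array of references can be copied as if it were an array of ints. Here is a minimal sketch of the same idea, with hypothetical names (`narrow_ref`, `jint_like`, `conjoint_jints`, `conjoint_narrow_refs`) and with `std::copy` / `std::copy_backward` standing in for the platform-dependent routine, which is an assumption of this sketch rather than what HotSpot does:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for narrowOop and jint.
using narrow_ref = std::uint32_t;
using jint_like  = std::int32_t;
static_assert(sizeof(narrow_ref) == sizeof(jint_like), "this cast is wrong");

// Overlap-tolerant copy of 32-bit values.
inline void conjoint_jints(const jint_like* from, jint_like* to, std::size_t count) {
  if (count == 0 || from == to) return;
  if (std::less<const jint_like*>()(to, from)) {
    // Destination starts before the source: copy front to back.
    std::copy(from, from + count, to);
  } else {
    // Destination starts after the source: copy back to front.
    std::copy_backward(from, from + count, to + count);
  }
}

// Overload for compressed references: reuse the 32-bit routine.
inline void conjoint_narrow_refs(const narrow_ref* from, narrow_ref* to, std::size_t count) {
  // Same spirit as assert_params_ok: both pointers must be int-aligned.
  assert(reinterpret_cast<std::uintptr_t>(from) % alignof(jint_like) == 0);
  assert(reinterpret_cast<std::uintptr_t>(to) % alignof(jint_like) == 0);
  conjoint_jints(reinterpret_cast<const jint_like*>(from),
                 reinterpret_cast<jint_like*>(to), count);
}
```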

Now we're platform-dependent, as the copy operation has a different implementation for each OS/architecture. I'll go with Windows as an example. openjdk\hotspot\src\os_cpu\windows_x86\vm\copy_windows_x86.inline.hpp:

static void pd_conjoint_oops_atomic(oop* from, oop* to, size_t count) {
  // Do better than this: inline memmove body  NEEDS CLEANUP
  if (from > to) {
    while (count-- > 0) {
      // Copy forwards
      *to++ = *from++;
    }
  } else {
    from += count - 1;
    to   += count - 1;
    while (count-- > 0) {
      // Copy backwards
      *to-- = *from--;
    }
  }
}
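The direction test in that routine is what makes the copy safe when source and destination overlap, in the same way memmove is. The sketch below restates it for a hypothetical `word_t` element type (an assumption standing in for oop) and adds an early return for an empty range, which the excerpt does not have:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical pointer-sized element type standing in for an oop.
using word_t = std::intptr_t;

// Copies `count` words one at a time, choosing the direction so that
// overlapping ranges within the same array are handled correctly.
inline void conjoint_words(const word_t* from, word_t* to, std::size_t count) {
  if (count == 0) return;
  assert(from != nullptr && to != nullptr);
  if (from > to) {
    // Destination is below the source: copying forwards never
    // overwrites a source word before it has been read.
    while (count-- > 0) {
      *to++ = *from++;
    }
  } else {
    // Destination is at or above the source: start from the last
    // word and copy backwards.
    from += count - 1;
    to   += count - 1;
    while (count-- > 0) {
      *to-- = *from--;
    }
  }
}
```

Shifting the first four elements of `{1, 2, 3, 4, 5}` up by one slot with this routine gives `{1, 1, 2, 3, 4}`; a forwards-only loop would smear the first element across the whole range instead.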

And... to my surprise, it iterates through the elements (the oop values), copying them one by one (seemingly). Can someone explain why the copy is done, even at the native level, by iterating through the elements in the array?


asked Jun 26 '12 15:06

Andrei Bârsan

1 Answer

Because jint maps most closely to int, which in turn maps most closely to the old hardware notion of a WORD, which is basically the same size as the width of the data bus.

Today's memory architectures and CPUs are designed to keep processing even in the event of a cache miss, and memory accesses tend to prefetch whole blocks. The code you are looking at isn't quite as "bad" in performance as you might think. The hardware is smarter than that, and unless you actually profile, your "smart" fetching routines might add nothing (or even slow processing down).

When you are first introduced to hardware architectures, you are necessarily introduced to simple ones. Modern ones do a lot more, so you can't assume that code that looks inefficient actually is inefficient. For example, when a memory lookup is needed to evaluate the condition of an if statement, the CPU often predicts the outcome and speculatively executes the predicted branch while the lookup is in flight, discarding that work if the prediction turns out to be wrong once the data is available. If you want to be efficient, you must profile and then act on the profiled data.
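As a rough illustration of "profile, then act", here is a minimal timing sketch that compares an element-by-element loop with `std::memmove`. The names `CopyTimings`, `loop_copy`, and `time_copies` are hypothetical, and the numbers it produces depend entirely on the compiler, flags, and machine. An optimizing compiler may turn the loop into a memmove call itself, or drop repeated copies altogether, so a serious measurement would need a proper benchmarking harness.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical result record for this sketch.
struct CopyTimings {
  long long loop_ns;     // total time spent in the element loop
  long long memmove_ns;  // total time spent in std::memmove
  bool same_result;      // both strategies produced identical arrays
};

// Element-by-element copy, in the spirit of the routine in the question.
inline void loop_copy(const int* from, int* to, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) to[i] = from[i];
}

// Times `reps` copies of `n` ints with each strategy.
inline CopyTimings time_copies(std::size_t n, int reps) {
  assert(n > 0 && reps > 0);
  using clock = std::chrono::steady_clock;
  using std::chrono::duration_cast;
  using std::chrono::nanoseconds;

  std::vector<int> src(n), a(n), b(n);
  for (std::size_t i = 0; i < n; ++i) src[i] = static_cast<int>(i);

  const auto t0 = clock::now();
  for (int r = 0; r < reps; ++r) loop_copy(src.data(), a.data(), n);
  const auto t1 = clock::now();
  for (int r = 0; r < reps; ++r) std::memmove(b.data(), src.data(), n * sizeof(int));
  const auto t2 = clock::now();

  CopyTimings result;
  result.loop_ns = static_cast<long long>(duration_cast<nanoseconds>(t1 - t0).count());
  result.memmove_ns = static_cast<long long>(duration_cast<nanoseconds>(t2 - t1).count());
  result.same_result = (a == b);
  return result;
}
```

Calling `time_copies` with a few different sizes and comparing `loop_ns` against `memmove_ns` on the target machine is the kind of evidence to collect before deciding that the loop is a problem.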

Look at the section of the interpreter that branches on the JVM opcode. You'll see it is (or perhaps just was) an ifdef macro oddity that supported (at one time) three different ways of jumping to the code that handles the opcode. That was because the three ways made a meaningful performance difference across the Windows, Linux, and Solaris architectures.
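To make "different ways of jumping to the code that handles the opcode" concrete, here is a toy sketch with two common dispatch styles, a switch statement and a table of function pointers. The four-opcode instruction set and every name in it are invented for illustration. This is not the JVM's interpreter, and it does not claim to reproduce the three variants mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction set; programs are assumed well-formed.
enum Op : unsigned char { PUSH1, DUP, ADD, HALT };
using Stack = std::vector<int>;

inline void do_push1(Stack& s) { s.push_back(1); }
inline void do_dup(Stack& s) {
  assert(!s.empty());
  const int top = s.back();
  s.push_back(top);
}
inline void do_add(Stack& s) {
  assert(s.size() >= 2);
  const int b = s.back();
  s.pop_back();
  s.back() += b;
}

// Style 1: a switch on the opcode.
inline int run_switch(const std::vector<Op>& code) {
  Stack s;
  for (std::size_t pc = 0; pc < code.size(); ++pc) {
    switch (code[pc]) {
      case PUSH1: do_push1(s); break;
      case DUP:   do_dup(s);   break;
      case ADD:   do_add(s);   break;
      case HALT:  return s.empty() ? 0 : s.back();
    }
  }
  return s.empty() ? 0 : s.back();
}

// Style 2: index a table of handlers with the opcode.
inline int run_table(const std::vector<Op>& code) {
  using Handler = void (*)(Stack&);
  static const Handler table[] = {do_push1, do_dup, do_add};
  Stack s;
  for (std::size_t pc = 0; pc < code.size(); ++pc) {
    if (code[pc] == HALT) break;
    table[code[pc]](s);
  }
  return s.empty() ? 0 : s.back();
}
```

Both functions compute the same result for the same program. Which one runs faster depends on the compiler and the CPU's branch predictor, which is exactly why it has to be measured per platform.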

Perhaps they could have included MMX routines, but the fact that they didn't tells me that Sun didn't think the performance gain on modern hardware was enough to worry about.


answered Sep 29 '22 17:09

Edwin Buck
