
Understanding safe access of JNI arguments

I'm doing some research regarding how HotSpot performs garbage-collection and/or heap-compaction while JNI code is running.

It appears to be common knowledge that objects could be moved at any time in Java. I'm trying to understand, definitively, whether JNI code is subject to the effects of garbage collection. A number of JNI functions exist to explicitly prevent garbage collection, such as GetPrimitiveArrayCritical. Such a function makes sense if references are indeed volatile, but makes no sense if they are not.

There seems to be a substantial amount of conflicting information on this subject and I'm trying to sort it out.

JNI code runs in a safepoint and can continue running, unless it calls back into Java or calls some specific JVM methods, at which point it may be stopped to prevent leaving the safepoint (thanks Nitsan for the comments).

What mechanism JVM use to block threads during stop-the-world pause

The above makes me think that garbage-collection is going to run concurrently with JNI code. That can't be safe, right?

To implement local references, the Java VM creates a registry for each transition of control from Java to a native method. A registry maps nonmovable local references to Java objects, and keeps the objects from being garbage collected. All Java objects passed to the native method (including those that are returned as the results of JNI function calls) are automatically added to the registry. The registry is deleted after the native method returns, allowing all of its entries to be garbage collected.

https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/design.html#wp16789

Okay, so the local references are nonmovable but that doesn't say anything about the compaction.

The JVM must ensure that objects passed as parameters from Java™ to the native method and any new objects created by the native code remain reachable by the GC. To handle the GC requirements, the JVM allocates a small region of specialized storage called a "local reference root set".

A local reference root set is created when:

  • A thread is first attached to the JVM (the "outermost" root set of the thread).
  • Each J2N transition occurs.

The JVM initializes the root set created for a J2N transition with:

  • A local reference to the caller's object or class.
  • A local reference to each object passed as a parameter to the native method.

New local references created in native code are added to this J2N root set, unless you create a new "local frame" using the PushLocalFrame JNI function.

http://www.ibm.com/support/knowledgecenter/en/SSYKE2_5.0.0/com.ibm.java.doc.diagnostics.50/diag/understanding/jni_transitions_j2n.html

Okay, so IBM stores the passed objects in the local reference root set, but the documentation doesn't discuss memory compaction. It just says that the objects won't be garbage-collected.

The GC might, at any time, decide it needs to compact the garbage-collected heap. Compaction involves physically moving objects from one address to another. These objects might be referred to by a JNI local or global reference. To allow compaction to occur safely, JNI references are not direct pointers to the heap. At least one level of indirection isolates the native code from object movement.

If a native method needs to obtain direct addressability to the inside of an object, the situation is more complicated. The requirement to directly address, or pin, the heap is typical where there is a need for fast, shared access to large primitive arrays. An example might include a screen buffer. In these cases a JNI critical section can be used, which imposes additional requirements on the programmer, as specified in the JNI description for these functions. See the JNI specification for details.

  • GetPrimitiveArrayCritical returns the direct heap address of a Java™ array, disabling garbage collection until the corresponding ReleasePrimitiveArrayCritical is called.
  • GetStringCritical returns the direct heap address of a java.lang.String instance, disabling garbage collection until ReleaseStringCritical is called.

http://www.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/understanding/jni_copypin.html
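As a rough illustration of the behaviour quoted above, here is a toy C++ model of "return a direct address and hold off the collector until release". It is a sketch under stated assumptions, not IBM's or HotSpot's implementation: ToyPinnedArray and all of its members are hypothetical names, and "GC" is reduced to relocating a single buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy model; nothing here is a real JVM or JNI API.
class ToyPinnedArray {
public:
    explicit ToyPinnedArray(std::vector<std::int8_t> contents)
        : storage_(std::move(contents)) {}

    // Plays the role of GetPrimitiveArrayCritical: hand out the raw
    // address and open a critical section.
    std::int8_t* get_critical() {
        ++critical_depth_;
        return storage_.data();
    }

    // Plays the role of ReleasePrimitiveArrayCritical: close the critical
    // section and run a collection that was deferred in the meantime.
    void release_critical() {
        assert(critical_depth_ > 0);
        if (--critical_depth_ == 0 && gc_deferred_) {
            gc_deferred_ = false;
            move_storage();
        }
    }

    // Returns true if the "collection" ran now, false if it was deferred
    // because a critical section is open.
    bool request_gc() {
        if (critical_depth_ > 0) {
            gc_deferred_ = true;
            return false;
        }
        move_storage();
        return true;
    }

    std::uintptr_t address() const {
        return reinterpret_cast<std::uintptr_t>(storage_.data());
    }
    std::int8_t at(std::size_t i) const { return storage_.at(i); }
    int moves() const { return moves_; }

private:
    // "Compaction": copy the contents into a fresh buffer, then drop the old one.
    void move_storage() {
        std::vector<std::int8_t> relocated(storage_);
        storage_.swap(relocated);
        ++moves_;
    }

    std::vector<std::int8_t> storage_;
    int critical_depth_ = 0;
    bool gc_deferred_ = false;
    int moves_ = 0;
};
```

In this model a raw pointer obtained from get_critical() stays valid only because request_gc() refuses to move the buffer until release_critical() is called; once the section closes, the deferred move happens and the old address is stale.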

Okay, so IBM basically says that the JNI passed objects COULD be moved at any time! How about HotSpot?

GetArrayElements family of functions are documented to either copy arrays, or pin them in place (and, in so doing, prevent a compacting garbage collector from moving them). It is documented as a safer, less-restrictive alternative to GetPrimitiveArrayCritical. However, I'd like to know which VMs and/or garbage collectors (if any) actually pin arrays instead of copying them.

Which VMs or GCs support JNI pinning?

Aleksandr seems to think that the only safe way to access the memory of passed objects is through Get<PrimitiveType>ArrayElements or GetPrimitiveArrayCritical.

Trent's answer was less than exciting.

At least in current JVM's (i have not checked to see how far back this was backported), CMS GC, since it's non-moving is not affected by JNI critical sections (modulo that non stop-world compaction can occur if there is a concurrent mode failure -- in that case the allocating thread must stall until the critical section is cleared -- this latter kind of stall is likely to be much rarer than the slow-path direct allocation in old gen pathology that you might see more frequently). Note that direct allocation in old gen is not only slow in and of itself (a first-order performance impact) but can in turn cause more tenuring (because of so-called nepotism), as well as slower subsequent scavenges because of more dirty cards needing scanning (both of the latter being second-order effects).

http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2007-December/000074.html

This email on the OpenJDK mailing list seems to say that the ConcurrentMarkAndSweep GC is non-moving.

https://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All

This post about G1 mentions that it does compact the heap, but it says little specifically about moving data.


Since the IBM documentation says that objects could be compacted at any time, we need to figure out WHY the HotSpot JNI functions are safe at all. If memory compaction can indeed happen while JNI code is running, these functions must move to some safe state first to avoid concurrent memory effects.

Now, I've been following the HotSpot code as best I can. Let's take a look at GetByteArrayElements. It seems logical that the method must ensure that the pointer is correct before copying the elements. Let's try to find out how.

Here is the macro for GetByteArrayElements

#ifndef USDT2
#define DEFINE_GETSCALARARRAYELEMENTS(ElementTag,ElementType,Result, Tag) \
JNI_QUICK_ENTRY(ElementType*, \
          jni_Get##Result##ArrayElements(JNIEnv *env, ElementType##Array array, jboolean *isCopy)) \
  JNIWrapper("Get" XSTR(Result) "ArrayElements"); \
  DTRACE_PROBE3(hotspot_jni, Get##Result##ArrayElements__entry, env, array, isCopy); \
  /* allocate a chunk of memory in c land */ \
  typeArrayOop a = typeArrayOop(JNIHandles::resolve_non_null(array)); \
  ElementType* result; \
  int len = a->length(); \
  if (len == 0) { \
    result = (ElementType*)get_bad_address(); \
  } else { \
    result = NEW_C_HEAP_ARRAY_RETURN_NULL(ElementType, len, mtInternal); \
    if (result != NULL) { \
      memcpy(result, a->Tag##_at_addr(0), sizeof(ElementType)*len); \
      if (isCopy) { \
        *isCopy = JNI_TRUE; \
      } \
    } \
  } \
  DTRACE_PROBE1(hotspot_jni, Get##Result##ArrayElements__return, result); \
  return result; \
JNI_END

Here is the macro for JNI_QUICK_ENTRY

#define JNI_QUICK_ENTRY(result_type, header)                         \
extern "C" {                                                         \
  result_type JNICALL header {                                \
    JavaThread* thread=JavaThread::thread_from_jni_environment(env); \
    assert( !VerifyJNIEnvThread || (thread == Thread::current()), "JNIEnv is only valid in same thread"); \
    ThreadInVMfromNative __tiv(thread);                              \
    debug_only(VMNativeEntryWrapper __vew;)                          \
VM_QUICK_ENTRY_BASE(result_type, header, thread)

I have followed every function in here and have yet to see any kind of mutex or memory synchronizer. The only thing I could not follow was __tiv, which does not seem to have a definition anywhere I could find.

  • Could someone explain to me why JNI interface methods such as GetByteArrayElements are safe?
  • While we're at it, can anyone find where the JNI call transitions from VM back to Native when JNI_QUICK_ENTRY exits?
Johnny V asked Sep 08 '16 01:09


2 Answers

How JNI methods work in HotSpot JVM

  1. Native methods may run concurrently with VM operations including GC. They are not stopped at safepoints.

  2. GC may move Java objects even if they are referenced from a running native method. A jobject handle is not a raw address into the heap, but rather one more level of indirection: consider it a pointer into a non-movable array of object references. Whenever an object is moved, the corresponding array slot is updated, but the pointer to this slot remains the same. That is, the jobject handle remains valid. Every time a native method calls a JNI function, the function checks whether the JVM is in the safepoint state. If it is (e.g. GC is running), the JNI function blocks until the safepoint operation is completed.

  3. During the execution of JNI functions like GetByteArrayElements, the corresponding thread is marked as _thread_in_vm. A safepoint cannot be reached while there are running threads in this state. E.g. if GC is requested during the execution of GetByteArrayElements, GC will be delayed until the JNI function returns.

  4. Thread state transition magic is performed by the line you've noticed:
    ThreadInVMfromNative __tiv(thread). Here __tiv is just an instance of the class. Its only purpose is to automatically call the ThreadInVMfromNative constructor and destructor.

    The ThreadInVMfromNative constructor calls transition_from_native, which checks for a safepoint and suspends the current thread if needed. The ~ThreadInVMfromNative destructor switches back to the _thread_in_native state.

  5. GetPrimitiveArrayCritical and GetStringCritical are the only JNI functions that provide raw pointers into the Java heap. They prevent GC from starting until the corresponding Release function is called.
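The indirection described in point 2 can be sketched with a small self-contained C++ model. This is only an illustration under simplifying assumptions: ToyHeap, Handle and their members are hypothetical names, not HotSpot types, and a "heap address" is just an index into a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical toy model; none of these types exist in HotSpot.
// A Handle plays the role of a jobject: the index of a slot that never moves.
using Handle = std::size_t;

struct ToyHeap {
    std::vector<int> cells;          // "heap": cells[address] holds an object payload
    std::vector<std::size_t> slots;  // handle table: slots[h] is the object's current address

    // Place an object at a given address and return a stable handle to it.
    Handle allocate_at(std::size_t address, int payload) {
        if (cells.size() <= address) cells.resize(address + 1, 0);
        cells[address] = payload;
        slots.push_back(address);
        return slots.size() - 1;
    }

    // Resolve a handle through its slot (compare JNIHandles::resolve_non_null).
    int& resolve(Handle h) {
        assert(h < slots.size());
        return cells[slots[h]];
    }

    // Current raw address of the object; stale after the next compaction.
    std::size_t raw_address(Handle h) const { return slots[h]; }

    // Sliding compaction: move each object to the lowest free address and
    // update its slot. Handles are untouched, so they stay valid.
    void compact() {
        std::vector<Handle> order(slots.size());
        std::iota(order.begin(), order.end(), Handle{0});
        std::sort(order.begin(), order.end(),
                  [this](Handle a, Handle b) { return slots[a] < slots[b]; });
        std::size_t next = 0;
        for (Handle h : order) {
            int payload = cells[slots[h]];
            cells[slots[h]] = 0;
            cells[next] = payload;
            slots[h] = next;
            ++next;
        }
        cells.resize(next);
    }
};
```

Allocating two objects at addresses 3 and 7 and then calling compact() moves them to addresses 0 and 1, yet resolve() with the original handles still returns the same payloads. A raw address saved before compact() would point at the wrong cell, which is why native code must go through handles unless it uses one of the Critical functions.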

Thread state transition when calling a JNI function from native code

  1. state = _thread_in_native;
    A native method may run concurrently with GC

  2. JNI function is called

  3. state = _thread_in_native_trans;
    GC cannot start at this point

  4. If a VM operation is in progress, block until it completes

  5. state = _thread_in_vm;
    Safe to access heap
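The transitions above, together with the RAII trick behind __tiv, can be modelled in a few lines of standard C++. This is a hedged sketch, not HotSpot source: ToyVM, ToyThread, ToyState and ToyThreadInVMFromNative are hypothetical names, and the real VM uses its own state polling and locking rather than a std::condition_variable.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <vector>

// Hypothetical toy model of the listed transitions; not HotSpot code.
enum class ToyState { in_native, in_native_trans, in_vm };

struct ToyThread {
    ToyState state = ToyState::in_native;  // written only by the owning thread
    std::vector<ToyState> trace;           // history of states, for inspection
    void set_state(ToyState s) { state = s; trace.push_back(s); }
};

class ToyVM {
public:
    // Steps 3 to 5: native code enters a JNI function.
    void enter_vm(ToyThread& t) {
        t.set_state(ToyState::in_native_trans);
        std::unique_lock<std::mutex> lock(mu_);
        // Step 4: block while a safepoint operation (e.g. GC) is pending.
        cv_.wait(lock, [this] { return !safepoint_pending_; });
        ++threads_in_vm_;
        t.set_state(ToyState::in_vm);
    }

    // The JNI function returns to native code.
    void leave_vm(ToyThread& t) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            assert(threads_in_vm_ > 0);
            --threads_in_vm_;
            t.set_state(ToyState::in_native);
        }
        cv_.notify_all();  // a waiting safepoint request may now proceed
    }

    // GC side: announce the safepoint, then wait until no thread is in the VM.
    void begin_safepoint() {
        std::unique_lock<std::mutex> lock(mu_);
        safepoint_pending_ = true;
        cv_.wait(lock, [this] { return threads_in_vm_ == 0; });
    }

    void end_safepoint() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            safepoint_pending_ = false;
        }
        cv_.notify_all();  // wake threads blocked in enter_vm
    }

    bool safepoint_reachable() {
        std::lock_guard<std::mutex> lock(mu_);
        return threads_in_vm_ == 0;
    }
    bool safepoint_pending() {
        std::lock_guard<std::mutex> lock(mu_);
        return safepoint_pending_;
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    bool safepoint_pending_ = false;
    int threads_in_vm_ = 0;
};

// RAII guard in the spirit of ThreadInVMfromNative: the constructor performs
// the transition into the VM, the destructor switches back to in_native.
class ToyThreadInVMFromNative {
public:
    ToyThreadInVMFromNative(ToyVM& vm, ToyThread& t) : vm_(vm), thread_(t) {
        vm_.enter_vm(thread_);
    }
    ~ToyThreadInVMFromNative() { vm_.leave_vm(thread_); }
    ToyThreadInVMFromNative(const ToyThreadInVMFromNative&) = delete;
    ToyThreadInVMFromNative& operator=(const ToyThreadInVMFromNative&) = delete;

private:
    ToyVM& vm_;
    ToyThread& thread_;
};
```

Declaring `ToyThreadInVMFromNative tiv(vm, t);` at the top of a block mirrors what the `__tiv` line does inside JNI_QUICK_ENTRY: while the guard is alive the thread counts as being in the VM and begin_safepoint() would wait; when the block ends, the destructor runs and the thread is back in in_native. That destructor call is also where the transition back to native happens when the JNI function exits.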

apangin answered Nov 07 '22 03:11


It appears to be common knowledge that objects could be moved at any time.

It may be common, but it isn't knowledge, and it isn't true. Objects passed to or held by JNI methods are not available for movement until the method returns, or the object is explicitly released, or the LocalFrame containing it is popped.

If this was true, then every JNI interface method must require a lock or some kind of memory synchronization?

No, see above.

user207421 answered Nov 07 '22 03:11