Does the string returned from <code>GetStringUTFChars()</code> end with a null terminator? Or do I need to determine the length using <code>GetStringUTFLength()</code> and null-terminate it myself?


Java native code string ending

2 Answers

Yes, <code>GetStringUTFChars</code> returns a null-terminated string. However, I don't think you should take my word for it; instead, you should find an authoritative online source that answers this question.

Let's start with the actual Java Native Interface Specification itself, where it says:

<blockquote> Returns a pointer to an array of bytes representing the string in modified UTF-8 encoding. This array is valid until it is released by <code>ReleaseStringUTFChars()</code>. </blockquote>

Oh, surprisingly it doesn't say whether it's null-terminated or not. Boy, that seems like a huge oversight, and fortunately somebody was kind enough to log this bug on Sun's Java bug database back in 2008. The notes on the bug point you to a similar but different documentation bug (which was closed without action). That bug suggests that readers buy a book, "The Java Native Interface: Programmer's Guide and Specification", since there's a suggestion that the book become the new specification for JNI.

But we're looking for an authoritative online source, and this is neither authoritative (it's not yet the specification) nor online.

Fortunately, the reviews for said book on a certain popular online book retailer suggest that the book is freely available online from Sun, and that would at least satisfy the online portion. Sun's JNI web page has a link that looks tantalizingly close, but that link sadly doesn't go where it says it goes.

So I'm afraid I cannot point you to an authoritative online source for this, and you'll have to buy the book (it's actually a good book), where it will explain to you that:

<blockquote> UTF-8 strings are always terminated with the <code>'\0'</code> character, whereas Unicode strings are not. To find out how many bytes are needed to represent a <code>jstring</code> in the UTF-8 format, JNI programmers can either call the ANSI C function <code>strlen</code> on the result of <code>GetStringUTFChars</code>, or call the JNI function <code>GetStringUTFLength</code> on the <code>jstring</code> reference directly. </blockquote>

(Note that in the above sentence, "Unicode" means "UTF-16", or more accurately "the internal two-byte string representation used by Java", though finding proof of that is left as an exercise for the reader.)
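To make the byte count in the quote above concrete, here is a minimal C sketch of how the modified UTF-8 length of a UTF-16 string can be computed. It is an illustration, not JNI's actual implementation: the name `modified_utf8_length` is hypothetical, and it assumes the standard modified UTF-8 rules, under which U+0000 is written as the two bytes 0xC0 0x80 and each surrogate half is encoded on its own in three bytes. Because U+0000 never produces a zero byte, a buffer that is null-terminated can be measured with `strlen`, and the result should match this count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of JNI): counts the bytes needed to encode
 * `count` UTF-16 code units in modified UTF-8, excluding any terminator. */
size_t modified_utf8_length(const uint16_t *units, size_t count)
{
    size_t bytes = 0;
    for (size_t i = 0; i < count; i++) {
        uint16_t u = units[i];
        if (u >= 0x0001 && u <= 0x007F) {
            bytes += 1; /* ASCII, except NUL */
        } else if (u <= 0x07FF) {
            bytes += 2; /* U+0000 (as 0xC0 0x80) and U+0080..U+07FF */
        } else {
            bytes += 3; /* U+0800..U+FFFF, including each surrogate half */
        }
    }
    return bytes;
}
```

For example, the three code units `'a'`, U+0000, `'b'` need four bytes, none of which is zero.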

195

answered Oct 16 '22 21:10

Edward Thomson

All current answers to the question seem to be either outdated (the last update to Edward Thomson's answer dates back to 2015) or based on the Android JNI documentation, which can be authoritative only in the Android world. The matter was clarified in the 2017 clean-up and update of the official Oracle JNI documentation, more specifically in this issue.

Now the JNI specification clearly states:

<blockquote>
String Operations

This specification makes no assumptions on how a JVM represent Java strings internally. Strings returned from these operations:
<ul>
<li>GetStringChars()</li>
<li>GetStringUTFChars()</li>
<li>GetStringRegion()</li>
<li>GetStringUTFRegion()</li>
<li>GetStringCritical()</li>
</ul>
are therefore not required to be NULL terminated. Programmers are expected to determine buffer capacity requirements via GetStringLength() or GetStringUTFLength().
</blockquote>

In the general case, this means one should never assume that strings returned by JNI are null-terminated, not even UTF-8 strings. In practice, one can test the specific behavior on each of the JVMs one supports. In my experience, referring only to JVMs I actually tested:

<ul>
<li>Oracle JVMs do null terminate both UTF-16 (with <code>\u0000</code>) and UTF-8 strings (with <code>'\0'</code>);</li>
<li>Android JVMs do terminate UTF-8 strings but not UTF-16 ones.</li>
</ul>
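If you would rather not depend on per-JVM behavior, one option is to copy the bytes using the length reported by `GetStringUTFLength()` and append the terminator yourself. The following is a minimal sketch, not an official recipe: `copy_with_terminator`, `jstring_to_cstring`, and the `WITH_JNI` guard are hypothetical names introduced here, and the JNI part is only compiled when `jni.h` is available and the guard is defined.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of JNI): copies `len` bytes and appends the
 * terminator itself, so it never relies on `src` being null-terminated.
 * Returns NULL on allocation failure; the caller frees the result. */
char *copy_with_terminator(const char *src, size_t len)
{
    char *dst = malloc(len + 1);
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}

#ifdef WITH_JNI /* hypothetical guard: define it only when building against jni.h */
#include <jni.h>

/* Returns a malloc'ed, null-terminated modified UTF-8 copy of `s`,
 * or NULL on failure. */
char *jstring_to_cstring(JNIEnv *env, jstring s)
{
    const char *utf = (*env)->GetStringUTFChars(env, s, NULL);
    if (utf == NULL) {
        return NULL; /* the JVM could not provide the characters */
    }
    jsize len = (*env)->GetStringUTFLength(env, s);
    char *copy = copy_with_terminator(utf, (size_t)len);
    (*env)->ReleaseStringUTFChars(env, s, utf);
    return copy;
}
#endif
```

The copy costs one extra allocation, but the native code then owns a buffer whose termination does not depend on which JVM produced the string.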

answered Oct 16 '22 21:10

ceztko
