Correct way to interact with arrays using SWIG

Tags: java, c, swig

I'm a bit lost with typemaps in SWIG and how to use arrays. I have prepared a working example that passes arrays between Java and C using SWIG, but I don't know if it is the correct way to do it.

Basically, I want to pass a byte array (byte[]) from Java to C as a `signed char *` plus its size, modify it in C and see the changes in Java, and also create an array in C and use it in Java.

I have taken a look at these questions: How to pass array(array of long in java) from Java to C++ using Swig, Pass an array to a wrapped function as pointer+size or range, How can I make Swig correctly wrap a char* buffer that is modified in C as a Java Something-or-other?

And in fact I used their solutions as a guide to make the example.

This is my code in the file arrays.h:

#include <cstdlib>   // calloc
#include <iostream>

bool createArray(signed char ** arrCA, int * lCA){
    *lCA = 10;
    *arrCA = (signed char*) calloc(*lCA, sizeof(signed char));

    for(int i = 0; i < *lCA; i++){
        (*arrCA)[i] = i;
    }

    return *arrCA != NULL;
}

bool readArray(const signed char arrRA[], const int lRA){
    for(int i = 0; i < lRA; i++){
        std::cout << static_cast<int>(arrRA[i]) << " ";  // print signed values as-is
    }
    std::cout << std::endl;
    return true;
}

bool modifyArrayValues(signed char arrMA[], const int lMA){
    for(int i = 0; i < lMA; i++){
        arrMA[i] = arrMA[i] * 2;
    }
    return true;
}


bool modifyArrayLength(signed char arrMALIn[], int lMALIn, signed char ** arrMALOut, int * lMALOut){

    *lMALOut = 5;
    *arrMALOut = (signed char*) calloc(*lMALOut, sizeof(signed char));

    for(int i = 0; i < *lMALOut; i++){
        (*arrMALOut)[i] = arrMALIn[i];
    }
    return true;
}

This is the .i file for swig (arrays.i):

%module arrays

%{
    #include "arrays.h"
%}

%typemap(jtype) bool createArray "byte[]"
%typemap(jstype) bool createArray "byte[]"
%typemap(jni) bool createArray "jbyteArray"
%typemap(javaout) bool createArray { return $jnicall; }
%typemap(in, numinputs=0) signed char ** arrCA (signed char * temp) "$1=&temp;"
%typemap(in, numinputs=0) int * lCA (int l) "$1=&l;"
%typemap(argout) (signed char ** arrCA, int * lCA) {
    $result = JCALL1(NewByteArray, jenv, *$2);
    JCALL4(SetByteArrayRegion, jenv, $result, 0, *$2, (const jbyte*) *$1);
}
%typemap(out) bool createArray {
    if (!$1) {
        return NULL;
    }
}


%typemap(jtype) (const signed char arrRA[], const int lRA) "byte[]"
%typemap(jstype) (const signed char arrRA[], const int lRA) "byte[]"
%typemap(jni) (const signed char arrRA[], const int lRA) "jbyteArray"
%typemap(javain) (const signed char arrRA[], const int lRA) "$javainput"

%typemap(in,numinputs=1) (const signed char arrRA[], const int lRA) {
  $1 = JCALL2(GetByteArrayElements, jenv, $input, NULL);
  $2 = JCALL1(GetArrayLength, jenv, $input);
}

%typemap(freearg) (const signed char arrRA[], const int lRA) {
  // Or use  0 instead of ABORT to keep changes if it was a copy
  JCALL3(ReleaseByteArrayElements, jenv, $input, $1, JNI_ABORT); 
}


%typemap(jtype) (signed char arrMA[], const int lMA) "byte[]"
%typemap(jstype) (signed char arrMA[], const int lMA) "byte[]"
%typemap(jni) (signed char arrMA[], const int lMA) "jbyteArray"
%typemap(javain) (signed char arrMA[], const int lMA) "$javainput"

%typemap(in, numinputs=1) (signed char arrMA[], const int lMA) {
    $1 = JCALL2(GetByteArrayElements, jenv, $input, NULL);
    $2 = JCALL1(GetArrayLength, jenv, $input);
}

%typemap(freearg) (signed char arrMA[], const int lMA) {
  JCALL3(ReleaseByteArrayElements, jenv, $input, $1, 0); 
} 

%typemap(jtype) (signed char arrMALIn[], int lMALIn) "byte[]"
%typemap(jstype) (signed char arrMALIn[], int lMALIn) "byte[]"
%typemap(jni) (signed char arrMALIn[], int lMALIn) "jbyteArray"
%typemap(javain) (signed char arrMALIn[], int lMALIn) "$javainput"

%typemap(in, numinputs=1) (signed char arrMALIn[], int lMALIn) {
    $1 = JCALL2(GetByteArrayElements, jenv, $input, NULL);
    $2 = JCALL1(GetArrayLength, jenv, $input);
}

%typemap(freearg) (signed char arrMALIn[], int lMALIn) {
    JCALL3(ReleaseByteArrayElements, jenv, $input, $1, JNI_ABORT); 
}

%typemap(jtype) bool modifyArrayLength "byte[]"
%typemap(jstype) bool modifyArrayLength "byte[]"
%typemap(jni) bool modifyArrayLength "jbyteArray"
%typemap(javaout) bool modifyArrayLength { return $jnicall; }
%typemap(in, numinputs=0) signed char ** arrMALOut (signed char * temp) "$1=&temp;"
%typemap(in, numinputs=0) int * lMALOut (int l) "$1=&l;"
%typemap(argout) (signed char ** arrMALOut, int * lMALOut) {
    $result = JCALL1(NewByteArray, jenv, *$2);
    JCALL4(SetByteArrayRegion, jenv, $result, 0, *$2, (const jbyte*) *$1);
}
%typemap(out) bool modifyArrayLength {
    if (!$1) {
        return NULL;
    }
}


%include "arrays.h"

And finally the Java code to test it:

public class Run{

    static {
        System.loadLibrary("Arrays");
    }

    public static void main(String[] args){

        byte[] test = arrays.createArray();

        printArray(test);       

        arrays.readArray(test);

        arrays.modifyArrayValues(test);

        printArray(test);

        byte[] test2 = arrays.modifyArrayLength(test);

        printArray(test2);

    }

    private static void printArray(byte[] arr){

        System.out.println("Array ref: " + arr);

        if(arr != null){
            System.out.println("Array length: " + arr.length);

            System.out.print("Arrays items: ");

            for(int i =0; i < arr.length; i++){
                System.out.print(arr[i] + " ");
            }
        }
        System.out.println();
    }
}

The example works, but I'm not sure that it is the correct way. I mean:

Is there an easier way to achieve the same result?

Does this code have memory leaks? On one hand I think it does, because I call calloc and never free the buffer; on the other hand I pass it to SetByteArrayRegion, so maybe freeing it would cause an error.

Does SetByteArrayRegion copy the values or only the reference? For example, what if, instead of calling calloc, the array is obtained by reference from a C++ object that is going to be destroyed when it goes out of scope?

Is the array returned to Java correctly freed when the reference is set to null?

Is there a way to specify where a typemap applies? In the .i file I have provided a typemap for each function, although I think I could reuse some of them. But if there were other functions with the same parameters that I don't want to typemap, how could I exclude them? I may not be able to modify the parameter names of the functions.

I have seen the carrays.i possibility described in the question How do I pass arrays from Java to C++ using Swig?, but that implies that if the array has 1000 items and I want to send it through a Java Socket or create a String from it, I have to make one JNI call per array item. I actually want a byte[] on the Java side, not a set of functions to access the underlying array, so that already existing code works without modification.


Context: The reason I want to achieve this is that there is a library that, among other functionality, allows importing and exporting data using Google Protocol Buffers. The code related to this question looks like this:

class SomeLibrary {

  bool export(const std::string & sName, std::string & toExport);

  bool import(const std::string & sName, const std::string & toImport);

}

The thing is that Protobuf in C++ uses std::string to store the data, but this data is binary, so it cannot be returned as a normal Java String because it gets truncated; more on this in Swig: convert return type std::string(binary) to java byte[].
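To illustrate the truncation, here is a minimal C++ sketch. It assumes the conversion to a Java String treats the bytes as a NUL-terminated C string, which is my understanding of the default behaviour rather than something verified here. The names makeBinaryPayload and lengthAsCString are made up for the illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical payload: five bytes with a NUL in the middle, as serialized
// binary data may well contain.
std::string makeBinaryPayload() {
    const char raw[] = {'a', 'b', '\0', 'c', 'd'};
    return std::string(raw, sizeof(raw));  // length passed explicitly, so all 5 bytes are kept
}

// Length seen by code that treats the data as a NUL-terminated C string.
std::size_t lengthAsCString(const std::string &s) {
    return std::strlen(s.c_str());
}
```

The std::string itself reports a size of 5, while lengthAsCString stops at the first NUL and reports 2. Anything that goes through the C-string view loses the tail of the payload, hence the need for an explicit pointer plus length.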

So my idea is to return a byte[] to Java for the serialized Protobuf (as the Java version of Protocol Buffers does) and to accept a byte[] for parsing Protobufs. To avoid getting SWIGTYPE_p_std_string for the second argument of export and String for the second argument of import, I have wrapped both functions using %extend, like this:

%extend SomeLibrary{

  bool export(const std::string & sName, char ** toExportData, int * toExportLength);

  bool import(const std::string & sName, char * toImportData, int toImportLength);

}

And now I should be able to write the typemaps.
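For reference, the bodies of such wrappers could look roughly like the sketch below. This is plain C++ with hypothetical helper names (copyToBuffer, fromBuffer), not the library's real API. It assumes the export side hands back a malloc'ed buffer that the caller, or a typemap, must free.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: copies the bytes of `src` into a freshly malloc'ed
// buffer and reports pointer and length through the out-parameters.
// The caller owns *outData and must release it with std::free().
bool copyToBuffer(const std::string &src, char **outData, int *outLength) {
    *outData = nullptr;
    *outLength = 0;
    if (src.empty()) {
        return true;  // nothing to copy; report an empty result
    }
    char *buf = static_cast<char *>(std::malloc(src.size()));
    if (buf == nullptr) {
        return false;  // allocation failed
    }
    std::memcpy(buf, src.data(), src.size());  // copies embedded NULs too
    *outData = buf;
    *outLength = static_cast<int>(src.size());
    return true;
}

// Hypothetical counterpart for the import direction: rebuilds a std::string
// from pointer + length without stopping at NUL bytes.
std::string fromBuffer(const char *data, int length) {
    if (data == nullptr || length <= 0) {
        return std::string();
    }
    return std::string(data, static_cast<std::size_t>(length));
}
```

A round trip through copyToBuffer and fromBuffer preserves a payload such as std::string("ab\0cd", 5) byte for byte, as long as the length travels alongside the pointer.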

But in order to be more general, I am asking about the general case of manipulating arrays between Java and C through SWIG while keeping a native Java byte[].

Javier Mr, asked Sep 19 '12 14:09


1 Answer

Don't discount carrays.i automatically. That said, SWIG already has some convenient typemaps:

%module test

%apply(char *STRING, size_t LENGTH) { (char *str, size_t len) };

%inline %{
void some_func(char *str, size_t len) {
}
%}

Which produces a function in the Java interface:

public static void some_func(byte[] str)

i.e. it takes an array you can build in Java like normal and fills in the pointer and length for you. Almost for free.

Your code as it stands almost certainly leaks - you'd want to call free() within the argout typemap to release the memory you allocated once it's been copied into the new Java array.
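As a rough illustration of that copy-then-free order, here is a minimal C++ sketch. It is not SWIG or JNI code: std::vector stands in for the Java byte[], and createNative and createManaged are made-up names. In the real argout typemap the copy is done by SetByteArrayRegion, which copies the bytes into the Java array, so freeing the native buffer afterwards does not affect it.

```cpp
#include <cstdlib>
#include <vector>

// Stand-in for the question's createArray(): allocates and fills a buffer.
bool createNative(signed char **out, int *len) {
    *len = 10;
    *out = static_cast<signed char *>(std::calloc(*len, sizeof(signed char)));
    if (*out == nullptr) {
        return false;  // check before touching the buffer
    }
    for (int i = 0; i < *len; i++) {
        (*out)[i] = static_cast<signed char>(i);
    }
    return true;
}

// Mirrors the order the argout typemap should follow: copy, then free.
std::vector<signed char> createManaged() {
    signed char *raw = nullptr;
    int len = 0;
    std::vector<signed char> managed;  // plays the role of the Java byte[]
    if (createNative(&raw, &len)) {
        managed.assign(raw, raw + len);  // the copy (SetByteArrayRegion in the typemap)
        std::free(raw);                  // safe: `managed` holds its own copy
    }
    return managed;
}
```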

You can selectively apply typemaps by both the type and the name of the parameters. See this document for more on typemap matching rules. You can also explicitly request a typemap where it wouldn't otherwise be used with %apply, as in the example I showed above. (Actually, %apply copies the typemaps, so modifying just one of them afterwards does not replace it in the general case.)

In general the typemaps for passing arrays from Java to C++ or working with arrays of known size are simpler than ones for returning from C++ to Java because the size information is more obvious.

My suggestion would be to do most of the allocation inside Java and to design functions that might grow an array so that they operate in two modes: one that indicates the size needed and one that actually does the work. You might do that with:

ssize_t some_function(char *in, size_t in_sz) {
  if (in_sz < the_size_I_need) {
    return the_size_I_need; // query the size is pretty fast
  }

  // do some work on in if it's big enough

  // use negative sizes or exceptions to indicate errors

  return the_size_I_really_used; // send the real size back to Java
}
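A concrete, compilable instance of that outline might look like the sketch below. The name fetchPayload and the fixed kPayload data are invented for the example, and long is used in place of ssize_t because ssize_t is POSIX rather than standard C++.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical payload the function wants to hand back to the caller.
static const char kPayload[] = {'d', 'a', 't', 'a', '\0', '!'};

// Two-mode function: if the caller's buffer is too small, only report the
// size needed; otherwise fill the buffer and report the bytes written.
long fetchPayload(char *out, std::size_t outSize) {
    const std::size_t needed = sizeof(kPayload);
    if (out == nullptr || outSize < needed) {
        return static_cast<long>(needed);  // query mode: nothing is written
    }
    std::memcpy(out, kPayload, needed);
    return static_cast<long>(needed);      // work mode: bytes actually used
}
```

The caller tells the two modes apart by comparing the return value with the size of the buffer it passed in: a larger value means nothing was written and a bigger buffer is needed.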

That would allow you to do something like the following in Java:

int sz = module.some_function(new byte[0]);
byte result[] = new byte[sz];
sz = module.some_function(result);

Note that with the default typemaps the new byte[0] is needed because they don't allow null to be used as an array - you could add typemaps that allow this if you wanted, or use %extend to provide an overload that didn't need an empty array.

Flexo, answered Oct 18 '22 03:10