
How can I modify a C-style array as a D-style array?

Question

What's the optimal way to accept a C-style array as a parameter, modify it as a D-style array (including changing the length), and return it as a C-style array?


In Context

I'm writing a library in D, which compiles to a DLL with a C interface (I'll be calling my D DLL from C++, so the C interface is necessary). It takes byte arrays and modifies their contents, sometimes changing the array length.

Because I'm using the C interface, my functions have to accept C-style arrays. Ideally, I'd like to be able to allocate more memory (i.e. extend bufferMaxSize) if the given buffer is too small.

This is how my D DLL accepts parameters right now:

// D library code; compiles to DLL with C interface.
// bufferSize is the data length, and is a pointer because I may modify the data length.
// bufferMaxSize is the total allocated buffer size.
export extern(C) void patchData(const size_t bufferMaxSize, size_t * bufferSize, byte * buffer) { ... }

In my D library, I have existing code that accepts D-style arrays. Somewhere along the line, there has to be a conversion of the C-style array to D-style array.

I'm currently doing the conversion like this (simplified example):

// D library code; compiles to DLL with C interface.
export extern(C) void patchData(const size_t bufferMaxSize, size_t * bufferSize, byte * buffer) {
    // Convert from C-style array to D-style.
    byte[] dStyleArray = buffer[0 .. *bufferSize];

    // Modify data.
    dStyleArray[0] = cast(byte) 0xab;
    dStyleArray[1] = cast(byte) 0xbc;

    dStyleArray.length = dStyleArray.length + 16;

    // Return modified data as C-style array.
    buffer[0 .. dStyleArray.length] = dStyleArray[0 .. dStyleArray.length];
    *bufferSize = dStyleArray.length;
}

It works, but I'm unsure of what's really happening here. My main concern is speed. If I'm looping over this function, I don't want to be constantly allocating new memory and copying its contents back and forth.

When I do byte[] dStyleArray = buffer[0 .. *bufferSize], is D allocating a new chunk of memory and copying everything into the D-style array, or is it pointing to the already-allocated C-style array?

What's going on when I do dStyleArray.length = dStyleArray.length + 16? Since dStyleArray was sliced from buffer, am I allocating new memory/copying memory now? Or did I extend into buffer?

When I do buffer[0 .. dStyleArray.length] = dStyleArray[0 .. dStyleArray.length];, I am copying memory, right?

Is it possible to just "bind" a D-style array to a C-style array, and access pre-allocated memory through the D-style array's interface?

asked Oct 19 '15 22:10 by sorbet



2 Answers

When I do byte[] dStyleArray = buffer[0 .. *bufferSize], is D allocating a new chunk of memory and copying everything into the D-style array, or is it pointing to the already-allocated C-style array?

It's pointing. Phobos uses the same trick to convert a C "string" to a D one: https://github.com/D-Programming-Language/phobos/blob/67c95e6de21d5d627e3c57128b4d6e332c82f785/std/string.d#L208-L211
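A minimal sketch of that aliasing, assuming the memory comes from C's malloc (the helper name sliceAliasesPointer is made up for illustration): the slice is only a pointer plus a length, so writing through it changes the caller's buffer.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: returns true if a slice taken from `p`
/// refers to the very same memory as `p`.
bool sliceAliasesPointer(byte* p, size_t n)
{
    byte[] view = p[0 .. n];   // no allocation, no copy: just (pointer, length)
    view[0] = cast(byte) 0x7f; // writes straight through to the C buffer
    return view.ptr is p && p[0] == 0x7f;
}

unittest
{
    auto p = cast(byte*) malloc(8);
    assert(p !is null);
    scope (exit) free(p);
    assert(sliceAliasesPointer(p, 8));
}
```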

What's going on when I do dStyleArray.length = dStyleArray.length + 16? Since dStyleArray was sliced from buffer, am I allocating new memory/copying memory now? Or did I extend into buffer?

That's probably not doing what you want or expect. It will allocate a new block in garbage-collected memory and copy the contents to it. The runtime cannot extend the buffer in place because it has no information about that block of memory (it is not managing it). Are you really looking to extend the buffer, or to move the pointer (which would be slicing in D)?
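One way to observe this, as a rough sketch that assumes `p` points at memory the GC does not manage (a stack array or a malloc'd block); growthReallocates is a hypothetical name:

```d
/// Hypothetical helper: returns true if growing a slice over non-GC memory
/// moved the data into a different block.
bool growthReallocates(byte* p, size_t n)
{
    byte[] view = p[0 .. n];
    view.length = view.length + 16; // the GC allocates a new block and copies n bytes
    view[0] = cast(byte) 0x11;      // changes the GC copy, not the caller's buffer
    return view.ptr !is p;
}

unittest
{
    byte[8] storage;
    storage[0] = 5;
    assert(growthReallocates(storage.ptr, storage.length));
    assert(storage[0] == 5); // the original buffer was left untouched
}
```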

When I do buffer[0 .. dStyleArray.length] = dStyleArray[0 .. dStyleArray.length];, I am copying memory, right?

Yes. That's lowered to a memcpy.

Is it possible to just "bind" a D-style array to a C-style array, and access pre-allocated memory through the D-style array's interface?

Yes, that's what you did at the beginning ;)

If you just want to change the first 2 elements of the array, just do the binding and change them, it will "just work".

If you want to test the behaviour, I'd recommend you put a unittest block below the function, so you can test what happens by giving it a pointer. Also, if you want to make sure you are not doing any GC allocation, you might want to consider putting @nogc on your function to statically check for that (and nothrow is usually a good idea as well for C functions).
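Something along these lines, as a sketch only: patchHeader is a made-up, cut-down stand-in for patchData (no export, no bufferMaxSize) so that the attributes and the unittest are easy to see.

```d
/// Hypothetical helper: patches the first two bytes in place.
/// @nogc makes the compiler reject anything that could allocate from the GC.
extern (C) void patchHeader(size_t* bufferSize, byte* buffer) @nogc nothrow
{
    if (buffer is null || bufferSize is null || *bufferSize < 2)
        return;
    byte[] view = buffer[0 .. *bufferSize]; // binds to the C memory, no copy
    view[0] = cast(byte) 0xab;
    view[1] = cast(byte) 0xbc;
    // view.length = view.length + 16; // rejected by the compiler under @nogc
}

unittest
{
    byte[4] storage;
    size_t used = storage.length;
    patchHeader(&used, storage.ptr);
    assert(storage[0] == cast(byte) 0xab);
    assert(storage[1] == cast(byte) 0xbc);
    assert(storage[2] == 0);
    assert(used == 4);
}
```

Compile with -unittest to run the block.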

answered Dec 04 '22 20:12 by Geod24


When I do byte[] dStyleArray = buffer[0 .. *bufferSize], is D allocating a new chunk of memory and copying everything into the D-style array, or is it pointing to the already-allocated C-style array?

It is just pointing to it. The slice operator on the right hand side is the same (conceptually) as array.pointer = &first_element; array._length = length; - a very quick and simple operation. (I called it _length instead of length btw because setting length property actually may call a function, that's next.)

What's going on when I do dStyleArray.length = dStyleArray.length + 16?

That will allocate new memory. When length is extended, unless the runtime can prove it is safe (or you tell it to assume it is and it knows it came from the GC), the array is copied to a new location. It basically calls realloc() on the pointer - though not literally, it isn't compatible with C realloc.

Since it came from C, the runtime just knows it doesn't own the memory, that it is managed elsewhere somehow, and will always allocate a new one when trying to extend. If you want to extend by some other means, you need to do it yourself.
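If you want to see what the runtime thinks, the built-in capacity property is a handy probe. A small sketch (reportedCapacity is a hypothetical name, and the buffer is assumed not to be GC memory): a capacity of 0 means the slice cannot grow in place.

```d
/// Hypothetical helper: the capacity the runtime reports for a slice
/// over memory handed in through a raw pointer.
size_t reportedCapacity(byte* p, size_t n)
{
    byte[] view = p[0 .. n];
    return view.capacity; // 0 means "any growth will reallocate"
}

unittest
{
    byte[8] storage; // stack memory, unknown to the GC
    assert(reportedCapacity(storage.ptr, storage.length) == 0);

    auto gcArray = new byte[](8); // freshly allocated GC array
    assert(gcArray.capacity >= gcArray.length);
}
```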

When I do buffer[0 .. dStyleArray.length] = dStyleArray[0 .. dStyleArray.length];, I am copying memory, right?

Right, that does copy because you sliced on the left hand side.

Is it possible to just "bind" a D-style array to a C-style array, and access pre-allocated memory through the D-style array's interface?

The plain right hand slice:

auto d_array = c_array[0 .. c_array_length];

handles it for everything except length extensions. It keeps the pointer so writing to elements will instantly affect the original thing. (BTW since it is shared C memory, make sure you don't free it while D is still using it! You should be fine as long as you only use it inside this one function though, without storing the slice anywhere.)

If you do need to extend the length, you need to do it yourself. The way I like to do it is to slice the whole potential array, full capacity, then slice that again to get a limited capacity window.

So like maybe:

auto whole_array = buffer[0 .. bufferMaxSize]; // assuming buffer is already fully allocated on the C side
auto part_youre_using = whole_array[0 .. *bufferSize];

// to extend:
*bufferSize += 16; // extend the size
part_youre_using = whole_array[0 .. *bufferSize]; // and reslice from the original

The reason I made the whole_array thing instead of reslicing buffer is so D can catch bounds violations for me. It doesn't do that on a naked pointer, but does on a sliced pointer since it knows the max size as its length.
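Put together, that could look roughly like the following. This is a sketch, not the asker's actual function: growInPlace and its extra parameter are invented for illustration, and it assumes the C side really allocated bufferMaxSize bytes.

```d
/// Hypothetical helper: grows the used window by `extra` bytes inside the
/// caller's buffer. Returns false and changes nothing if there is no room.
extern (C) bool growInPlace(size_t bufferMaxSize, size_t* bufferSize,
                            byte* buffer, size_t extra) @nogc nothrow
{
    if (buffer is null || bufferSize is null || *bufferSize > bufferMaxSize)
        return false;
    if (extra > bufferMaxSize - *bufferSize)
        return false; // not enough room left in the C buffer

    byte[] whole = buffer[0 .. bufferMaxSize]; // full capacity, bounds now known
    const oldSize = *bufferSize;
    *bufferSize = oldSize + extra;
    byte[] used = whole[0 .. *bufferSize];     // checked against whole.length
    used[oldSize .. $] = 0;                    // zero the newly exposed bytes
    return true;
}
```

Nothing here touches the GC, so it compiles under @nogc, and the caller sees the new length through bufferSize without any copy back.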

If you need to extend the buffer though, do it with the right C function, like realloc or whatever, then slice out the whole_array and part_youre_using again.
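A sketch of the realloc route, with loud assumptions: the signature is hypothetical (the caller has to pass a pointer to its pointer and to its capacity, which the original patchData does not do), and the block must have been allocated by the same C runtime's malloc/realloc that the D DLL links against.

```d
import core.stdc.stdlib : realloc;

/// Hypothetical helper: makes sure *buffer can hold at least `needed` bytes.
/// On failure the original block is left valid and unchanged.
extern (C) bool ensureCapacity(size_t* bufferMaxSize, byte** buffer,
                               size_t needed) @nogc nothrow
{
    if (bufferMaxSize is null || buffer is null)
        return false;
    if (needed <= *bufferMaxSize)
        return true; // already big enough

    auto grown = cast(byte*) realloc(*buffer, needed);
    if (grown is null)
        return false;

    *buffer = grown; // the old pointer may now be dangling
    *bufferMaxSize = needed;
    return true;
}
```

After a successful call, any slice taken from the old pointer is stale, so take the slices again, e.g. auto whole_array = (*buffer)[0 .. *bufferMaxSize];.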

answered Dec 04 '22 19:12 by Adam D. Ruppe