
Nim: Addresses of parameters and mutability

I'm trying to make up my mind about Nim's policy behind the "expression has no address" error. In particular, I have a C function which takes a pointer (plus length etc.) to some data buffer. I know that this function will not modify the data. Simplified:

type
  Buffer = object
    data: seq[float]

proc wrapperForCCall(buf: Buffer) =
  # neither buf.addr nor buf.data.addr works; both produce
  # Error: expression has no address
  # workaround:
  var tmp = buf.data          # costly copy
  callToC(tmp.len, tmp.addr)  # now it works

On the one hand this makes sense, since a parameter seems to behave exactly like a let binding, which also "has no address". On the other hand, I'm puzzled by this statement in the manual:

var parameters are never necessary for efficient parameter passing.

As far as I can see, the only way to avoid copying the data is by either:

  • passing the parameter as buf: var Buffer
  • passing a reference, i.e., using a ref object.

In both cases this suggests that my function modifies the data. Furthermore, it introduces mutability at the call site (i.e. users can no longer use let bindings for their buffers). The key question for me is: since "I know" that callToC is read-only, can I convince Nim to allow immutability without a copy? I see that this is dangerous, since I have to know for sure that the call does not mutate the data. Would this require some sort of "unsafe address" mechanism that allows forcing pointers to immutable data?

And my final mystery of parameter addresses: I tried to make the necessity of the copy explicit by changing the type to Buffer {.bycopy.} = object. In this case the copy already happens at call time, and I would expect to have access to the address now. Why is the access denied in this case as well?

bluenote10 asked May 08 '15 20:05


3 Answers

Nim now has an unsafeAddr operator, which makes it possible to take the address of let bindings and parameters and thus avoids the shallowCopy workaround. Obviously, one has to be very careful that nothing mutates the data behind the pointer.
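
A minimal sketch of how the wrapper from the question might look with unsafeAddr. It assumes the C function wants a length and a pointer to the first element; callToC here is a hypothetical pure-Nim stand-in for the real C function (it just sums the values) so that the snippet runs on its own. Note that the address is taken from the first element, buf.data[0], not from the seq itself, and that an empty seq has to be handled separately:

```nim
type
  Buffer = object
    data: seq[float]

# Hypothetical stand-in for the read-only C function: it only reads
# `len` floats starting at `data` and returns their sum.
proc callToC(len: int, data: ptr float): float =
  let arr = cast[ptr UncheckedArray[float]](data)
  for i in 0 ..< len:
    result += arr[i]

proc wrapperForCCall(buf: Buffer): float =
  # An empty seq has no first element to take the address of.
  if buf.data.len == 0:
    return 0.0
  # No copy: take the address of the first element of the immutable
  # parameter. This is only safe because callToC never writes through it.
  callToC(buf.data.len, unsafeAddr(buf.data[0]))

let b = Buffer(data: @[1.0, 2.0, 3.0])  # a let binding is enough
echo wrapperForCCall(b)
```

The caller keeps its let binding, and the responsibility for not mutating the data sits entirely with whatever receives the pointer.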

bluenote10 answered Oct 23 '22 08:10


You can avoid the deep copy of buf.data by using shallowCopy, e.g.:

var tmp: seq[float]
shallowCopy tmp, buf.data

The {.byCopy.} pragma only affects the calling convention (i.e. whether an object gets passed on the stack or via a reference).

You cannot take the address of buf or any part of it that isn't behind a ref or ptr because passing a value as a non-var parameter is a promise that the callee does not modify the argument. The shallowCopy builtin is an unsafe feature that circumvents that guarantee (I remember suggesting that shallowCopy should properly be renamed to unsafeShallowCopy to reflect that and to have a new shallowCopy where the second argument is a var parameter also).

Reimer Behrends answered Oct 23 '22 08:10


Let's start by clarifying the following:

var parameters are never necessary for efficient parameter passing.

This is generally true, because in Nim complex values like objects, sequences and strings will be passed by address (a.k.a. by reference) to procs accepting read-only parameters.

When you need to pass a sequence to an external C/C++ function, things get a bit more complicated. The most common way to do this is to rely on the openarray type, which will automatically convert the sequence to a pair of data pointer and a size integer:

# Let's say we have the following C function:

{.emit: """

#include <stdio.h>

void c_call_with_size(double *data, size_t len)
{
  printf("first value: %f; size: %zu \n" , data[0], len);
}

""".}

# We can import it like this:

proc c_call(data: openarray[float]) {.importc: "c_call_with_size", nodecl.}

# The usage is straight-forward:

type Buffer = object
  data: seq[float]

var b = Buffer(data: @[1.0, 2.0])

c_call(b.data)

There won't be any copies in the generated C code.

Now, if the wrapped C library doesn't accept a pair of data/size arguments as in the example here, I'd suggest creating a tiny C wrapper around it (you can create a header file or just use the emit pragma to create the necessary adapter functions or #defines).

Alternatively, if you really want to get your hands dirty, you can extract the underlying buffer from the sequence with the following helper proc:

proc rawBuffer[T](s: seq[T]): ptr T =
  {.emit: "result = `s`->data;".}

Then, it will be possible to pass the raw buffer to C like this:

{.emit: """

#include <stdio.h>

void c_call(double *data)
{
  printf("first value: %f \n", data[0]);
}

""".}

proc c_call(data: ptr float) {.importc: "c_call", nodecl.}

var b = Buffer(data: @[1.0, 2.0])
c_call(b.data.rawBuffer)

zah answered Oct 23 '22 09:10