C: Returning a void versus returning a double * from a subfunction

Question

I'm working on trying to speed up some general data processing in C. I've written several subroutines of the form:

double *do_something(double *arr_in, ...) {
   double *arr_out; 
   arr_out = malloc(...);

   for (...) {
     do the something on arr_in and put into arr_out
   }

   return arr_out; 
}

I like this style because it's easy to read and use, but often I call it as:

 array = do_something(array,...);

Would it make for faster code (and maybe prevent memory leaks) if I were to instead use void subfunctions as:

void do_something(double *arr_in, ...) {
   for (...) {
      arr_in[i] = do that something;
   }
   return;
}

update 1: I ran valgrind --leak-check=full on the executable and it appears there were no memory leaks using the first method. However, the executable links to a library which contains all the subroutines I made with this form, so it might not catch leaks from the library.

I'm curious as to how I would write the wrappers to free the memory and what the ** really does, or what a pointer to a pointer is, so I'm avoiding using the ** route (that and maybe I did it wrong because it didn't compile on my mac).

Here's one current subroutine:

double *cos_taper(double *arr_in, int npts)
{
int i;
double *arr_out;
double cos_taper[npts];
int M; 
M = floor( ((npts - 2) / 10) + .5);

arr_out = malloc(npts * sizeof(*arr_out)); /* size of one double, not of the pointer */

for (i=0; i<npts; i++) {
    if (i<M) {
        cos_taper[i] = .5 * (1-cos(i * PI / (M + 1)));
    }
    else if (i<npts - M - 2) {
        cos_taper[i] = 1;
    }
    else if (i<npts) {
        cos_taper[i] = .5 * (1-cos((npts - i - 1) * PI / (M + 1)));
    }
    arr_out[i] = arr_in[i] * cos_taper[i];
}
return arr_out;
}

From the advice I've gotten here, it sounds like a better method would be:

void cos_taper(double *arr_in, double *arr_out, int npts)
{
int i;
double cos_taper[npts];
int M; 
M = floor( ((npts - 2) / 10) + .5);

for (i=0; i<npts; i++) {
    if (i<M) {
        cos_taper[i] = .5 * (1-cos(i * PI / (M + 1)));
    }
    else if (i<npts - M - 2) {
        cos_taper[i] = 1;
    }
    else if (i<npts) {
        cos_taper[i] = .5 * (1-cos((npts - i - 1) * PI / (M + 1)));
    }
    arr_out[i] = arr_in[i] * cos_taper[i];
}
return;
}

call:

int main() {
  int npts;
  double *data, *cos_tapered;

  /* npts must be assigned a value here, before it is used below */
  data = malloc(sizeof(*data) * npts);
  cos_tapered = malloc(sizeof(*cos_tapered) * npts);

//fill data

  cos_taper(data, cos_tapered, npts);
  free(data);
  ...
  free(cos_tapered);
  ...
  return 0;
}

ergosys · Accepted Answer

The malloc can be expensive relative to the processing you are doing, depending on what it is. Rather than restrict yourself to in-place processing, just use two parameters, in and out, and leave allocation to the caller. This gives the caller the option to reuse memory without allocating a new array for each call.
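As a rough sketch of what caller-owned buffers buy you, the routine below takes separate in and out pointers and allocates nothing, so a caller that runs it repeatedly can get by with a single scratch allocation. The names scale and halve_repeatedly are made up for illustration and are not from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for any element-wise routine: the caller
 * supplies both buffers, so nothing is allocated in here. */
void scale(const double *in, double *out, size_t npts, double gain)
{
    for (size_t i = 0; i < npts; i++) {
        out[i] = in[i] * gain;
    }
}

/* Applies scale() `passes` times using one scratch buffer in total
 * instead of one allocation per call.
 * Returns 0 on success, -1 if malloc fails. */
int halve_repeatedly(double *data, size_t npts, int passes)
{
    if (npts == 0 || passes <= 0) {
        return 0; /* nothing to do */
    }
    assert(data != NULL);

    double *scratch = malloc(npts * sizeof *scratch);
    if (scratch == NULL) {
        return -1;
    }

    double *src = data;
    double *dst = scratch;
    for (int p = 0; p < passes; p++) {
        scale(src, dst, npts, 0.5);
        double *tmp = src; /* swap roles: last output is next input */
        src = dst;
        dst = tmp;
    }
    if (src != data) {
        /* the final result ended up in scratch; copy it home */
        memcpy(data, src, npts * sizeof *data);
    }
    free(scratch);
    return 0;
}
```

With the allocate-and-return style from the question, the same loop would call malloc once per pass and the caller would have to free every intermediate array.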

Carl Norum · Answer

If you can do your operation in place, doing so will probably help prevent bugs (at least memory related ones) and will be faster by at least the time taken to do the malloc() operation. The actual return type of your function probably doesn't affect the speed in any way.
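For illustration, here is a small in-place routine of the kind described above. It is a hypothetical example (remove_mean is not one of the asker's functions): it overwrites its argument, allocates nothing, and has no result for the caller to free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: subtracts the mean from every sample, in place. */
void remove_mean(double *arr, size_t npts)
{
    if (npts == 0) {
        return; /* nothing to do */
    }
    assert(arr != NULL);

    double sum = 0.0;
    for (size_t i = 0; i < npts; i++) {
        sum += arr[i];
    }
    const double mean = sum / (double)npts;

    for (size_t i = 0; i < npts; i++) {
        arr[i] -= mean; /* overwrite the input with the result */
    }
}
```

The call site becomes remove_mean(array, npts); instead of array = do_something(array, ...);, which also avoids losing the only pointer to the old buffer.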

Scott Smith · Answer

Returning the double * itself costs almost nothing in terms of execution time.

Much more significant is the allocation of memory each time you come into the function. If you can pre-allocate, or store the result in place as you suggested, that should greatly improve the speed.

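Another thing to consider is whether you actually need all of the precision that a double provides (vs. a float). Floats take half the memory, so large arrays put less pressure on the cache and memory bandwidth, and vectorized loops can process twice as many elements per instruction. Whether that turns into a real speedup depends on your hardware and workload, so measure before switching.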

rampion · Answer

I'd opt for letting the caller allocate the memory if they want to, but also be able to choose to have the operation done in place, or to have you do the allocation.

For operations that can't be done in place, you can manually check if the caller has given you the same input and output locations, and make a copy of the input yourself. Then process using that copy as input. This makes it look in place to the function caller.

For example, suppose you want to create a function that takes and shuffles an array of indexes such that output[i] == input[ input[i] ] (a silly function, true, but one that's nontrivial to do in place).

#include <stdlib.h> 
#include <string.h>
int shuffle(size_t const * input, size_t const size, size_t ** p_output)
{
    int retval = 0;
    size_t i;
    char in_place = 0;
    char cleanup_output = 0;

    if (size == 0)
    {
        return 0; // nothing to do
    }
    // make sure we can read our input and write our output
    else if (input == NULL || p_output == NULL)
    {
        return -2; // illegal input
    }
    // allocate memory for the output
    else if (*p_output == NULL)
    {
        *p_output = malloc(size * sizeof(size_t));
        if (*p_output == NULL) return -1; // memory allocation problem
        cleanup_output = 1; // free this memory if we run into errors
    }
    // use a copy of our input, since the algorithm doesn't operate in place.
    // and the input and output overlap at least partially
    else if (*p_output - size < input && input < *p_output + size)
    {
        size_t * const input_copy = malloc(size * sizeof(size_t));
        if (input_copy == NULL) return -1; // memory allocation problem
        memcpy( input_copy, input, size * sizeof(size_t));
        input = input_copy;
        in_place = 1;
    }

    // shuffle
    for (i = 0; i < size; i++)
    {
        if (input[i] >= size)
        {
            retval = -2; // illegal input
            break;
        }
        (*p_output)[i] = input[ input[i] ];
    }

    // cleanup
    if (in_place)
    {
         free((size_t *) input);
    }
    if (retval != 0 && cleanup_output)
    {
         free(*p_output);
         *p_output = NULL;
    }

    return retval;
}

This makes your function more robust - the function caller can allocate memory for the output (if they want to keep the input around), or have the output appear in the same place as the input, or have you allocate the memory for the output. This is especially nice if they got the input and output locations from somewhere else themselves, and aren't sure whether they're distinct. They don't have to know anything about the workings of the function.

// caller allocated memory
my_allocated_mem = malloc( my_array_size * sizeof(size_t) );
if(my_allocated_mem == NULL) { /*... */ }
shuffle( my_array, my_array_size, &my_allocated_mem );

// function allocated memory
my_allocated_mem = NULL;
shuffle( my_array, my_array_size, &my_allocated_mem );

// in place calculation
shuffle( my_array, my_array_size, &my_array);

// (naughty user isn't checking the function for error values, but you get the idea...)

You can see a full example of use here.

Since C doesn't have exceptions, it's fairly standard to use the return value of a function to report errors, and to pass calculated values back through pointer parameters (output arguments).
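A minimal sketch of that convention, which also shows what the ** is for: the function needs to change the caller's pointer, so it receives the address of that pointer. The names make_ramp and ramp_sum and the error codes are assumptions chosen to mirror shuffle above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example. Returns 0 on success, -1 if allocation fails,
 * -2 on illegal arguments. On success *p_out points to a new array of
 * npts doubles that the caller must free(). */
int make_ramp(size_t npts, double **p_out)
{
    if (p_out == NULL || npts == 0) {
        return -2;
    }
    double *buf = malloc(npts * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }
    for (size_t i = 0; i < npts; i++) {
        buf[i] = (double)i;
    }
    *p_out = buf; /* write through to the caller's own pointer */
    return 0;
}

/* Example caller: checks the status code before touching the result. */
int ramp_sum(size_t npts, double *sum_out)
{
    if (sum_out == NULL) {
        return -2;
    }
    double *ramp = NULL;
    int status = make_ramp(npts, &ramp); /* &ramp has type double ** */
    if (status != 0) {
        return status; /* ramp was never set, nothing to free */
    }
    assert(ramp != NULL); /* guaranteed when status is 0 */

    double sum = 0.0;
    for (size_t i = 0; i < npts; i++) {
        sum += ramp[i];
    }
    *sum_out = sum;
    free(ramp);
    return 0;
}
```

A plain double * parameter would only let make_ramp write into memory the caller already owns; the extra level of indirection is what lets it hand back a freshly allocated array while keeping the return value free for the status code.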
