Static pointer to dynamically allocated buffer inside function

Tags:

c

malloc

numpy

I have a function in C that dynamically allocates a buffer, which is passed to another function to store its return value. Something like the following dummy example:

#include <stdlib.h>  /* malloc, free */

void other_function(float in, float *out, int out_len) {
    /* Fills 'out' with 'out_len' values calculated from 'in' */
}

void function(float *data, int data_len, float *out) {
    float *buf;
    int buf_len = 2 * data_len, i;
    buf = malloc(sizeof(float) * buf_len);

    for (i = 0; i < data_len; i++, data++, out++) {
        other_function(*data, buf, buf_len);
        /* Do some other stuff with the contents of buf and write to *out */
    }
    free(buf);
}

function is called by an iterator over a multi-dimensional array (it's a NumPy gufunc kernel, to be precise), so it gets called millions of times with the same value for data_len. It seems wasteful to create and destroy the buffer over and over again. I would normally move the allocation of the buffer to the caller of function and pass in a pointer to it, but I don't directly control the caller, so that is not possible. Instead, I am considering doing the following:

void function(float *data, int data_len, float *out) {
    static float *buf = NULL;
    static int buf_len = 0;
    int i;
    if (buf_len != 2 * data_len) {
        buf_len = 2 * data_len;
        buf = realloc(buf, sizeof(float) * buf_len); /* same as malloc if buf == NULL */
    }
    for (i = 0; i < data_len; i++, data++, out++) {
        other_function(*data, buf, buf_len);
        /* Do some other stuff with the contents of buf and write to *out */
    }
}

That means that I never directly free the memory I allocate: it gets reused in subsequent calls, and then lingers until my program exits. It doesn't seem like the right thing to do, but it doesn't seem too bad either, as the amount of memory allocated is always going to be small. Am I worrying too much? Is there a better approach to this?

Jaime asked Aug 22 '13 21:08

2 Answers

This approach is legitimate (but see below), although tools like valgrind will report the buffer at exit, typically as "still reachable" memory. (It's not a leak in the sense of an unbounded increase in memory usage.) You might want to benchmark exactly how much time is spent in malloc and free compared to the other things the function is doing.

If you can use C99 or gcc, and if your buffer is not overly large, you should also consider variable-length arrays, which are as fast as (or faster than) a static buffer and create no heap fragmentation. If you're on another compiler, you can look into the non-standard (but widely supported) alloca extension.
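A minimal sketch of the variable-length-array variant, assuming a C99 compiler. The body of other_function and the summing step are placeholders I made up so the example compiles; only the signatures come from the question. Note that the array lives on the stack, so this is only reasonable while 2 * data_len stays small.

```c
/* Hypothetical stand-in for the question's other_function: fills 'out'
   with 'out_len' values derived from 'in'. */
static void other_function(float in, float *out, int out_len) {
    for (int i = 0; i < out_len; i++)
        out[i] = in * (float)i;
}

/* Same signature as the question's function, but the scratch buffer is a
   C99 variable-length array on the stack instead of a heap allocation. */
void function(float *data, int data_len, float *out) {
    if (data_len <= 0)
        return;                      /* a VLA must have a positive length */

    int buf_len = 2 * data_len;
    float buf[buf_len];              /* released automatically on return */

    for (int i = 0; i < data_len; i++, data++, out++) {
        other_function(*data, buf, buf_len);
        /* Placeholder for the "other stuff": sum the buffer into *out. */
        float sum = 0.0f;
        for (int j = 0; j < buf_len; j++)
            sum += buf[j];
        *out = sum;
    }
}
```

Because every call gets its own buffer, this version has no shared state between calls.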

You do need to be aware that using a static buffer makes your function:

  1. Thread-unsafe - if it is called from multiple threads simultaneously, one call will destroy the data of the other. If the function is called from NumPy, this is probably not a problem, as threads will be effectively serialized by the GIL.

  2. Non-reentrant - if other_function calls some Python code which ends up calling function - for whatever reason - before function finishes, your function will again destroy its own data.

If you don't need true parallel execution and reentrancy, this use of static variables is fine, and a lot of C code uses them that way.
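If thread safety (point 1) ever does matter, one possible variation is to make the static buffer thread-local. This is only a sketch under assumptions: it relies on the C11 _Thread_local keyword, the name function_tls, the int return value, and the body of other_function are hypothetical, and it does nothing for reentrancy (point 2). It also checks the result of realloc, which the code in the question does not.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the question's other_function. */
static void other_function(float in, float *out, int out_len) {
    for (int i = 0; i < out_len; i++)
        out[i] = in + (float)i;
}

/* Returns 0 on success, -1 if the buffer could not be allocated.
   (The int return value is an addition; the original returns void.) */
int function_tls(float *data, int data_len, float *out) {
    /* One buffer per thread, so concurrent callers no longer share it.
       A thread's buffer is not freed when that thread exits. */
    static _Thread_local float *buf = NULL;
    static _Thread_local int buf_len = 0;

    if (data_len <= 0)
        return 0;

    if (buf_len != 2 * data_len) {
        float *tmp = realloc(buf, sizeof(float) * 2 * (size_t)data_len);
        if (tmp == NULL)
            return -1;               /* the old buffer is still valid */
        buf = tmp;
        buf_len = 2 * data_len;
    }

    for (int i = 0; i < data_len; i++, data++, out++) {
        other_function(*data, buf, buf_len);
        *out = buf[buf_len - 1];     /* placeholder for the real work */
    }
    return 0;
}
```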

user4815162342 answered Oct 05 '22 00:10


This is a fine approach, and something like this is likely used internally by many libraries. The memory will be freed automatically when the program exits.

You might want to round buf_len up to a multiple of some block size, so you don't realloc() every time data_len changes slightly. But if data_len almost always has the same value, this isn't necessary.
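The rounding idea could look roughly like this. The helper names get_scratch and round_up and the block size of 64 floats are made up for illustration. Unlike the code in the question, this sketch only ever grows the buffer, and it keeps the old buffer if realloc fails.

```c
#include <stdlib.h>

#define BLOCK 64  /* hypothetical block size, in floats */

/* Round n up to the next multiple of BLOCK (for n > 0). */
static size_t round_up(size_t n) {
    return (n + BLOCK - 1) / BLOCK * BLOCK;
}

/* Returns a scratch buffer holding at least 'needed' floats, or NULL on
   allocation failure (or if nothing has been requested yet).
   Hypothetical helper; not thread-safe or reentrant. */
float *get_scratch(size_t needed) {
    static float *buf = NULL;
    static size_t capacity = 0;

    if (needed > capacity) {
        size_t new_cap = round_up(needed);
        float *tmp = realloc(buf, sizeof(float) * new_cap);
        if (tmp == NULL)
            return NULL;   /* existing buffer, if any, is left untouched */
        buf = tmp;
        capacity = new_cap;
    }
    return buf;
}
```

With this, function would call get_scratch(2 * data_len) at the top and reallocate only when the request crosses a block boundary.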

Barmar answered Oct 05 '22 01:10