
Implementing C extension functions for PostgreSQL - how do I do this? (passing data between C/PostgreSQL)

Tags: c, postgresql

I am writing a C extension for PostgreSQL (v8.4). I am currently stuck on how to pass columnar data from PostgreSQL to my C functions. I also have a question about memory ownership, as PostgreSQL seems to do a lot of memory management behind the scenes.

I would be grateful if someone could help me "join the dots", so that I end up with a basic skeleton code base on which I can build the library.

This is what I have so far:

/*******************************************************/
/*                  C header file                      */
/*******************************************************/
typedef struct _myarray
{
    double *data;
    size_t len;
} MyArray;


MyArray * NEW_MyArray(const size_t len);
void Destroy_MyArray(MyArray * arr);
size_t NumElements_MyArray(MyArray * arr);      /* trivial function returns number of elements */
MyArray * NotTrivial_MyArray(MyArray * arr);    /* non trivial function returns MyArray (a float8[] in PG) */
double HeapFunc_MyArray(MyArray * arr);         /* allocs from heap */


/*******************************************************/
/*                   C Source file                     */
/*******************************************************/

/* utility conversion funcs */
/* How do I convert from the structure returned by array_agg to float8[] (or int4[] ?) */


MyArray * NEW_MyArray(const size_t len){
    /* Do I use palloc0() or calloc() here ? */
}

void Destroy_MyArray(MyArray * arr){
    /* Do I use pfree() or free() here ? */
}

size_t NumElements_MyArray(MyArray * arr){
    assert(arr != 0);
    return arr->len;
}

MyArray * NotTrivial_MyArray(MyArray * arr){
    assert(arr != 0);
    MyArray * ptr = NEW_MyArray(arr->len);
    return ptr;
}

double HeapFunc_MyArray(MyArray * arr){
    /* Create temporary variables on heap (use palloc0() or calloc()?) */
    /* Cleanup temp variables (use pfree() or free()?) */
    return 42/1.0;
}



/*******************************************************/
/* PostgreSQL wrapper funcs implementation source file */
/*******************************************************/

/* Prototypes */
PG_FUNCTION_INFO_V1(test_num_elements);
PG_FUNCTION_INFO_V1(test_not_trivial);
PG_FUNCTION_INFO_V1(test_heapfunc);

Datum test_num_elements(PG_FUNCTION_ARGS);
Datum test_not_trivial(PG_FUNCTION_ARGS);
Datum test_heapfunc(PG_FUNCTION_ARGS);


Datum
test_num_elements(PG_FUNCTION_ARGS)
{
    /* Convert data returned by array_agg() into MyArray * (how?) */
    /* invoke NumElements_MyArray() */
    /* Do I free temporary MyArray * ptr or will PG clean up 
       - if I have to clean up (like I suspect), do I use pfree() or free() ?*/
    PG_RETURN_INT32(result);
}

Datum
test_not_trivial(PG_FUNCTION_ARGS)
{
    /* Ditto, as above */
    PG_RETURN_POINTER(/* utility function to convert MyArray* to float8[] equiv for PG (how) */); 
}

Datum
test_heapfunc(PG_FUNCTION_ARGS)
{
    /* Ditto, as above */
    PG_RETURN_FLOAT8(result);
}


-- SQL FUNCTIONS

CREATE OR REPLACE FUNCTION test_num_elements(float8[])  RETURNS int4
AS '$libdir/pg_testlib.so' LANGUAGE 'c';

CREATE OR REPLACE FUNCTION test_not_trivial(float8[])  RETURNS float8[]
AS '$libdir/pg_testlib.so' LANGUAGE 'c';

CREATE OR REPLACE FUNCTION test_heapfunc(float8[])  RETURNS float8
AS '$libdir/pg_testlib.so' LANGUAGE 'c';


-- SQL TEST
SELECT test_num_elements(array_agg(salary)) FROM employees;
SELECT test_not_trivial(array_agg(salary)) FROM employees;
SELECT test_heapfunc(array_agg(salary)) FROM employees;

In summary, my questions are:

  1. How do I convert columnar data from array_agg() into a C array of doubles (or ints, as the case may be)?
  2. How do I convert a C array of ints (or doubles) back into int4[] or float8[] for consumption in PostgreSQL?
  3. Memory allocation principles: do I use the PostgreSQL memory management functions palloc()/pfree(), or can I use calloc()/free()? Also, when using the PG memory functions, am I responsible for freeing the memory I allocated?
Homunculus Reticulli asked Jan 05 '12 09:01


1 Answer

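First, you should always use palloc/pfree to manage memory. Memory obtained with palloc belongs to the current memory context, and PostgreSQL releases it when that context is reset or deleted, normally no later than the end of the transaction. Memory obtained with malloc/calloc is not tracked this way: each connection is handled by an independent process, so if you do leak it, the leak lasts for the life of that connection (granted, that can be quite long).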

Normally, one allocates data as:

void *data = palloc(size_of_your_data + VARHDRSZ);
SET_VARSIZE(data, size_of_your_data + VARHDRSZ);

and then access your data using:

MyArray *myarr = (MyArray*) VARDATA(data);

Once you are finished, you can:

PG_RETURN_POINTER(data);

If you return data, you must not free it. If you palloc temporary storage, release it with pfree (never free) once you are done with it.
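To make the header-plus-payload layout above concrete, here is a minimal standalone model of it. It is only a sketch: MODEL_HDRSZ, model_alloc, model_data and model_size are hypothetical names standing in for VARHDRSZ, palloc + SET_VARSIZE, VARDATA and VARSIZE, malloc is used so that it compiles outside the server, and the real PostgreSQL header encoding is more involved than a plain 4-byte integer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VARHDRSZ: a plain 4-byte length header. */
#define MODEL_HDRSZ sizeof(uint32_t)

/*
 * Allocate header + payload and record the TOTAL size in the header,
 * mirroring the palloc + SET_VARSIZE pair. malloc is an assumption made
 * so the model runs standalone; it is not what server code would call.
 */
void *model_alloc(size_t payload_size)
{
    uint32_t total;
    void    *buf;

    assert(payload_size <= UINT32_MAX - MODEL_HDRSZ);
    total = (uint32_t) (payload_size + MODEL_HDRSZ);
    buf = malloc(total);
    if (buf == NULL)
        return NULL;
    memcpy(buf, &total, sizeof total);
    return buf;
}

/* Counterpart of VARDATA in this model: skip the header. */
void *model_data(void *buf)
{
    return (char *) buf + MODEL_HDRSZ;
}

/* Counterpart of VARSIZE in this model: read the total size back. */
uint32_t model_size(const void *buf)
{
    uint32_t total;

    memcpy(&total, buf, sizeof total);
    return total;
}
```

Note that the payload starts 4 bytes into the buffer, so it is not necessarily aligned for a double or a pointer. Copying values in and out with memcpy, rather than casting the payload pointer to a wider type, sidesteps that.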

Now, MyArray is not what you want if you need a float8[]. You need to use an ArrayType, as described in section 34.9.11, "Polymorphic Arguments and Return Types", of the PostgreSQL documentation.
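As far as I know, the usual route is to fetch the argument with PG_GETARG_ARRAYTYPE_P(0), unpack it with deconstruct_array() (declared in utils/array.h) into a Datum array, read each element with DatumGetFloat8(), and go the other way with Float8GetDatum() and construct_array(). Once the values are in a plain C array of doubles, the rest is ordinary C. The sketch below covers only that plain C side, using the function names from the question. FromDoubles_MyArray and ToDoubles_MyArray are hypothetical helpers, and calloc/free are used only so that the code runs standalone; inside the server you would swap in palloc0/pfree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct _myarray
{
    double *data;
    size_t  len;
} MyArray;

/* Standalone version: calloc here, palloc0 inside the server. */
MyArray *NEW_MyArray(const size_t len)
{
    MyArray *arr = calloc(1, sizeof *arr);

    if (arr == NULL)
        return NULL;
    /* Allocate at least one element so len == 0 still yields a valid pointer. */
    arr->data = calloc(len > 0 ? len : 1, sizeof *arr->data);
    if (arr->data == NULL)
    {
        free(arr);
        return NULL;
    }
    arr->len = len;
    return arr;
}

/* Standalone version: free here, pfree inside the server. */
void Destroy_MyArray(MyArray *arr)
{
    if (arr != NULL)
    {
        free(arr->data);
        free(arr);
    }
}

size_t NumElements_MyArray(MyArray *arr)
{
    assert(arr != NULL);
    return arr->len;
}

/* Hypothetical helper: copy n doubles into a freshly allocated MyArray. */
MyArray *FromDoubles_MyArray(const double *src, size_t n)
{
    MyArray *arr = NEW_MyArray(n);

    if (arr != NULL && n > 0)
        memcpy(arr->data, src, n * sizeof *src);
    return arr;
}

/* Hypothetical helper: copy up to cap elements out; returns the count copied. */
size_t ToDoubles_MyArray(const MyArray *arr, double *dst, size_t cap)
{
    size_t n;

    assert(arr != NULL);
    n = arr->len < cap ? arr->len : cap;
    if (n > 0)
        memcpy(dst, arr->data, n * sizeof *dst);
    return n;
}
```

Because FromDoubles_MyArray copies its input, the MyArray does not alias the buffer produced by unpacking the PostgreSQL array, so the two can be released independently.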

apmasell answered Sep 30 '22 13:09