How Does String Conversion Between PyUnicode String and C String Work? [closed]

I have a PyUnicode object I'm trying to convert back to a C string (char *).

The way I am trying to do it does not seem to be working. Here is my code:

PyObject * objectCompName = PyTuple_GET_ITEM(compTuple, (Py_ssize_t) 0);
PyObject * ooCompName = PyUnicode_AsASCIIString(objectCompName);
char * compName = PyBytes_AsString(ooCompName);
Py_DECREF(ooCompName);

Is there another/better way I should be doing this?

ComputerLocus asked Mar 18 '16 at 19:03


2 Answers

First encode the PyUnicode object into a Python bytes object (see the ASCII codec functions: https://docs.python.org/2/c-api/unicode.html#ascii-codecs), then take a char* from the result. That pointer refers to the bytes object's internal buffer, so it is only valid while the bytes object is still alive.

The sketch below shows the steps:

// Assumption: you have a variable named "pyobj" which is
// a pointer to an instance of PyUnicodeObject.

PyObject* temp = PyUnicode_AsASCIIString(pyobj);
if (NULL == temp) {
    // Means the string can't be converted to ASCII, the codec failed
    printf("Oh noes\n");
    return;
}

// Get a pointer to the bytes object's internal buffer.
// PyUnicode_AsASCIIString returns a bytes object, so use PyBytes_AsString.
char* c_str = PyBytes_AsString(temp);

// Use the string in some manner
printf("The python unicode string is: %s\n", c_str);

// Release temp last: c_str points into temp and must not be used after this
Py_DECREF(temp);
user 12321 answered Sep 19 '22 at 03:09


If a UTF-8 encoded char * is acceptable, use PyUnicode_AsUTF8AndSize (available since Python 3.3):

// PyTuple_GetItem returns a borrowed reference, so no Py_DECREF is needed
PyObject * objectCompName = PyTuple_GetItem(compTuple, 0);
if (! objectCompName) {
    return NULL;
}

Py_ssize_t size;
const char *ptr = PyUnicode_AsUTF8AndSize(objectCompName, &size);
if (!ptr) {
    return NULL;
}

// Note: the buffer that ptr points to is owned by objectCompName and goes away
// with it. Copy it (for example with `strdup`) if it must outlive the object.
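
As a rough illustration of the copy step, here is a minimal sketch of a size-aware alternative to `strdup`. The helper name `copy_utf8_buffer` is hypothetical. Its `ptr` and `size` parameters stand in for the values returned by PyUnicode_AsUTF8AndSize (with the `Py_ssize_t` size cast to `size_t`), and Python.h is deliberately left out so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `size` bytes from `ptr` into a fresh buffer and
 * NUL-terminate it. Returns NULL if allocation fails. The caller must free(). */
static char *copy_utf8_buffer(const char *ptr, size_t size)
{
    assert(ptr != NULL);  /* debug-build check: callers pass a valid buffer */

    char *copy = malloc(size + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, ptr, size);  /* copies embedded NUL bytes too */
    copy[size] = '\0';
    return copy;
}
```

Unlike `strdup`, which stops at the first NUL byte, this keeps the whole buffer, which matters if the Python string can contain `\0` characters. For strings without embedded NULs the two approaches give the same result.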

Also, understand that it is mandatory to check the return value of every Py* function call in your code.

Here PyTuple_GetItem returns NULL if compTuple is not a tuple or if index 0 is out of range (IndexError). PyUnicode_AsUTF8AndSize returns NULL if objectCompName is not a str object. If you ignore these return values, CPython can crash with SIGSEGV when those conditions occur.
