
How Does String Conversion Between PyUnicode String and C String Work? [closed]

I have a PyUnicode object I'm trying to convert back to a C string (char *).

The way I am trying to do it does not seem to be working. Here is my code:

PyObject * objectCompName = PyTuple_GET_ITEM(compTuple, (Py_ssize_t) 0);
PyObject * ooCompName = PyUnicode_AsASCIIString(objectCompName);
char * compName = PyBytes_AsString(ooCompName);
Py_DECREF(ooCompName);

Is there another/better way I should be doing this?

ComputerLocus asked Mar 18 '16 19:03


2 Answers

You first need to encode the PyUnicode object into a Python bytes object (see the ASCII codecs section of the C API docs: https://docs.python.org/2/c-api/unicode.html#ascii-codecs). You can then get a char* from the result.

Here is a sketch to get you started:

// Assumption: you have a variable named "pyobj" which is
// a pointer to an instance of PyUnicodeObject.

PyObject* temp = PyUnicode_AsASCIIString(pyobj);
if (NULL == temp) {
    // Means the string can't be converted to ASCII, the codec failed
    printf("Oh noes\n");
    return;
}

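// Get the actual bytes as a C string. PyUnicode_AsASCIIString returns a
// bytes object, so use PyBytes_AsString (not PyByteArray_AsString).
char* c_str = PyBytes_AsString(temp);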

// Use the string in some manner
printf("The python unicode string is: %s\n", c_str);

// Release the temporary bytes object at the end. c_str points into it,
// so c_str must not be used after this line.
Py_XDECREF(temp);

user 12321 answered Sep 19 '22 03:09


If a UTF-8 encoded char * is acceptable, use PyUnicode_AsUTF8AndSize (available since Python 3.3):

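// PyTuple_GetItem returns a borrowed reference, so no Py_DECREF is needed.
PyObject * objectCompName = PyTuple_GetItem(compTuple, 0);
if (! objectCompName) {
    return NULL;
}

Py_ssize_t size;
// The return type is const char * since Python 3.7 (plain char * before that).
const char *ptr = PyUnicode_AsUTF8AndSize(objectCompName, &size);
if (!ptr) {
    return NULL;
}

// The buffer pointed to by ptr is owned by objectCompName and is only valid
// while that object is alive. Copy it (for example with `strdup`) if you need it longer.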
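
On the copying step: strdup stops at the first NUL byte, but a Python str may contain U+0000, and the size reported by PyUnicode_AsUTF8AndSize counts those bytes too. If that matters for your data, a size-aware copy is one option. The following is only a sketch in plain C; copy_utf8_buffer is a hypothetical name, not a CPython function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate `size` bytes from `src` and append a NUL.
 * Returns NULL if allocation fails. The caller owns the result and must free() it. */
char *copy_utf8_buffer(const char *src, size_t size)
{
    char *copy = malloc(size + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, size);  /* copies embedded NUL bytes as well */
    copy[size] = '\0';
    return copy;
}
```

In the snippet above it could be called as char *owned = copy_utf8_buffer(ptr, (size_t)size); and the result stays valid after objectCompName goes away.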

Also, keep in mind that it is mandatory to check the return value of every Py* function call that can fail.

Here, the tuple lookup returns NULL if compTuple is not a tuple or if index 0 is out of range (IndexError). PyUnicode_AsUTF8AndSize returns NULL if objectCompName is not a str object. If you ignore these return values, CPython will crash with SIGSEGV when the conditions are right.
