
Define array of strings in Cython

Tags:

cython

Stuck on some basic Cython here: what's a canonical and efficient way to define an array of strings in Cython? Specifically, I want to define a fixed-length constant array of char * strings. (Please note that I would prefer not to bring in NumPy at this point.)

In C this would be:

/* cletters.c */
#include <stdio.h>

int main(void)
{
    const char *headers[3] = {"to", "from", "sender"};
    int i;
    for (i = 0; i < 3; i++)
        printf("%s\n", headers[i]);
}

Attempt in Cython:

# cython: language_level=3
# letters.pyx

cpdef main():
    cdef const char *headers[3] = {"to", "from", "sender"}
    print(headers)

However, this gives:

(cy) $ python3 ./setup.py build_ext --inplace --quiet
cpdef main():
    cdef const char *headers[3] = {"to", "from", "sender"}
                               ^
------------------------------------------------------------

letters.pyx:5:32: Syntax error in C variable declaration
Brad Solomon asked Dec 01 '18 16:12



2 Answers

You need two lines:

%%cython
cpdef main():
    cdef const char *headers[3]
    headers[:] = ['to', 'from', 'sender']
    print(headers)

Somewhat counterintuitively, one assigns unicode strings (Python 3!) to char*. That is one of Cython's quirks. On the other hand, when initializing every element with a single value, a bytes object is needed:

%%cython
cpdef main():
    cdef const char *headers[3]
    headers[:] = b'init_value'  ## unicode-string 'init_value' doesn't work.
    print(headers)

Another alternative is the following one-liner:

%%cython
cpdef main():
    cdef const char **headers = ['to', 'from', 'sender']

    print(headers[0], headers[1], headers[2])

This is not exactly the same as above and leads to the following C code:

  char const **__pyx_v_headers;
  ...
  char const *__pyx_t_1[3];
  ...
  __pyx_t_1[0] = ((char const *)"to");
  __pyx_t_1[1] = ((char const *)"from");
  __pyx_t_1[2] = ((char const *)"sender");
  __pyx_v_headers = __pyx_t_1;

__pyx_v_headers is of type char const ** (a pointer, not an array), and the downside is that print(headers) no longer works out of the box.
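
The array/pointer distinction behind this can be sketched in plain C. This is a minimal illustration and not Cython's generated code; ARRAY_LEN, print_all, and headers_demo are hypothetical names introduced here. An array type carries its length, which is presumably what lets Cython turn the array version of headers into a printable Python object, while a bare pointer does not, so the element count has to be passed along separately:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Element count of a true array; only meaningful where the array type is visible. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* A pointer carries no length, so the caller must supply the count. */
size_t print_all(const char **items, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        printf("%s\n", items[i]);
    return count;
}

/* Hypothetical demo: the same three strings seen as an array and as a pointer. */
size_t headers_demo(void)
{
    const char *headers[3] = {"to", "from", "sender"}; /* length is part of the type */
    const char **view = headers;                       /* decays to a pointer; length is lost */

    assert(ARRAY_LEN(headers) == 3);
    return print_all(view, ARRAY_LEN(headers));
}
```

Calling headers_demo() from a main function prints the three strings, one per line. With the pointer form in Cython, the equivalent is to index the elements explicitly, as in print(headers[0], headers[1], headers[2]) above.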

ead answered Dec 10 '22 14:12


For Python 3 Unicode strings, this is possible:

cdef Py_UNICODE* x[2] 
x = ["hello", "worlᏪd"]

or

cdef Py_UNICODE** x
x = ["hello", "worlᏪd"]
Dev Aggarwal answered Dec 10 '22 15:12