
Obtain the size of an integer at compile-time in Cython

Tags: cython

Is it possible, and if so how, to determine the size, in bits, of the integer data types in Cython?

I'm trying to do something like this to obtain the integer sizes:

cdef WORD_BITS = 0
IF sizeof(unsigned long long) == 8:
    WORD_BITS = 64
    DEF VECTOR_LENGTH_SHIFT_AMOUNT = 6
ELSE:
    WORD_BITS = 32
    DEF VECTOR_LENGTH_SHIFT_AMOUNT = 5

ctypedef unsigned long long word_t

cdef int vector_length(size_t bit_size):

    cdef size_t size = bit_size >> VECTOR_LENGTH_SHIFT_AMOUNT
    if size << VECTOR_LENGTH_SHIFT_AMOUNT < bit_size:
        size += 1
    return size

cdef class BitVector(object):

    cdef size_t length
    cdef size_t array_size
    cdef word_t *array

    def __cinit__(self, size_t size):
        self.length = size
        self.array_size = vector_length(size)
        self.array = <word_t *>calloc(self.array_size, sizeof(word_t))

    def __dealloc__(self):
        free(self.array)

I need to handle both the individual bits of the array elements and the elements themselves, so I have to know how many bits each element contains (to compute the proper masks and shifts). Trying to compile code like the above yields:

$ python setup.py build_ext --inplace
Compiling bitvector.pyx because it changed.
Cythonizing bitvector.pyx

Error compiling Cython file:
------------------------------------------------------------
...
cimport cython


# check whether we are running on a 64 or 32 bit architecture.
cdef WORD_BITS = 0
IF sizeof(unsigned long long) == 8:
  ^
------------------------------------------------------------

bitvector.pyx:7:3: Invalid compile-time expression

Traceback (most recent call last):
  File "setup.py", line 6, in <module>
    ext_modules=cythonize('bitvector.pyx')
  File "/usr/lib/python2.7/dist-packages/Cython/Build/Dependencies.py", line 673, in cythonize
    cythonize_one(*args[1:])
  File "/usr/lib/python2.7/dist-packages/Cython/Build/Dependencies.py", line 737, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: bitvector.pyx

Is there a working alternative?

I know that there is a stdint.h header that should define the integer types, but I cannot think of a way to use it since:

  • I don't know how to check whether a type is defined (e.g. how do you write IF uint64_t is not defined: in Cython?).
  • Cython's documentation states that only things defined by DEF and the compiler can be checked in IFs, thus I doubt that I would be able to use stdint.h anyway.

It seems that this is not feasible in Cython, since the check I want to make can only be performed when compiling from C to machine code, not when translating from Cython to C.

Now I wonder: is it possible to write a Cython extension in such a way that this kind of check is added to the C source code?

I mean, can I somehow write:

cdef WORD_BITS = 0
IF sizeof(unsigned long long) == 8:
    WORD_BITS = 64
    DEF VECTOR_LENGTH_SHIFT_AMOUNT = 6
ELSE:
    WORD_BITS = 32
    DEF VECTOR_LENGTH_SHIFT_AMOUNT = 5

ctypedef unsigned long long word_t

in such a way that this IF "isn't processed" by Cython but is passed through, so that the final C file contains the equivalent code?

Bakuriu asked Apr 02 '13 13:04


1 Answer

Instead of using compile-time IF/DEF to define the size and shift values, I would change your vector_length function slightly so that it uses sizeof directly. Cython passes the sizeof operator through to the generated C, and the C compiler substitutes the actual size of the type at compile time. See this section of the glibc documentation for more information on using sizeof and CHAR_BIT to obtain the width of a type: https://www.gnu.org/software/libc/manual/html_node/Width-of-Type.html

from libc.stdlib cimport calloc, free
from libc.limits cimport CHAR_BIT

ctypedef unsigned long long word_t

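cdef size_t vector_length(size_t bit_size):
    # Bits per word, derived from the real size of word_t by the C compiler.
    cdef size_t bits_per_word = CHAR_BIT*sizeof(word_t)
    # Round up; // stays an integer division even under language_level=3.
    return (bit_size + bits_per_word - 1) // bits_per_word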

cdef class BitVector(object):
    cdef size_t length
    cdef size_t array_size
    cdef word_t *array

    def __cinit__(self, size_t size):
        self.length = size
        self.array_size = vector_length(size)
        self.array = <word_t *>calloc(self.array_size, sizeof(word_t))

    def __dealloc__(self):
        free(self.array)
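
For comparison, here is a rough C sketch of what the size arithmetic above amounts to once the C compiler has evaluated sizeof. It also covers the per-bit index and mask computation that the question mentions. The names BITS_PER_WORD, bit_set and bit_test are hypothetical, chosen only for this illustration, and this is not the code Cython actually generates.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

typedef unsigned long long word_t;

/* Bits in one word. This is a constant expression, so the compiler
   folds it at compile time; no Cython-level IF/DEF is involved. */
#define BITS_PER_WORD (CHAR_BIT * sizeof(word_t))

/* Number of words needed to hold bit_size bits (ceiling division). */
static size_t vector_length(size_t bit_size)
{
    return (bit_size + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Hypothetical helper: set bit i in an array of words. */
static void bit_set(word_t *array, size_t i)
{
    array[i / BITS_PER_WORD] |= (word_t)1 << (i % BITS_PER_WORD);
}

/* Hypothetical helper: return 1 if bit i is set, 0 otherwise. */
static int bit_test(const word_t *array, size_t i)
{
    return (int)((array[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1u);
}
```

Dividing and taking the remainder by BITS_PER_WORD replaces the hard-coded shift amounts 5 and 6 from the question; a typical optimizing compiler is expected to turn both into shifts and masks when the word size is a power of two.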

It is also worth noting that unsigned long long is guaranteed to be at least 64 bits wide (https://en.wikipedia.org/wiki/C_data_types). Your code seems to assume that it can be either 32 or 64 bits, but with a standards-compliant compiler it will always be 64 bits or more.
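
If you want the C compiler to confirm that guarantee, a check along the following lines could live on the C side. This is only a sketch: it assumes a C11 compiler for _Static_assert, and the macro name WORD_IS_EXACTLY_64_BITS is made up for the example. Because sizeof cannot be used in an #if directive, the preprocessor branch relies on ULLONG_MAX from limits.h instead.

```c
#include <limits.h>   /* CHAR_BIT, ULLONG_MAX */

typedef unsigned long long word_t;

/* Compilation fails here if word_t were ever narrower than 64 bits. */
_Static_assert(CHAR_BIT * sizeof(word_t) >= 64,
               "unsigned long long must be at least 64 bits wide");

/* Preprocessor-level check, using the limits.h macro rather than sizeof.
   WORD_IS_EXACTLY_64_BITS is a hypothetical name for this example. */
#if ULLONG_MAX == 0xFFFFFFFFFFFFFFFFULL
#define WORD_IS_EXACTLY_64_BITS 1
#else
#define WORD_IS_EXACTLY_64_BITS 0
#endif
```

This is the kind of test that can only run when compiling from C to machine code, which is why Cython's own IF cannot express it.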

Jon Lund Steffensen answered Oct 27 '22 07:10