Cython: Create memoryview without NumPy array?

Tags:

Since I found memory-views handy and fast, I try to avoid creating NumPy arrays in cython and work with the views of the given arrays. However, sometimes it cannot be avoided, not to alter an existing array but create a new one. In upper functions this is not noticeable, but in often called subroutines it is. Consider the following function

#@cython.profile(False)
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
cdef double [:] vec_eq(double [:] v1, int [:] v2, int cond):
    ''' Function output corresponds to v1[v2 == cond]'''
    cdef unsigned int n = v1.shape[0]
    cdef unsigned int n_ = 0
    # Size of array to create
    cdef size_t i
    for i in range(n):
        if v2[i] == cond:
            n_ += 1
    # Create array for selection
    cdef double [:] s = np.empty(n_, dtype=np_float) # Slow line
    # Copy selection to new array
    n_ = 0
    for i in range(n):
        if v2[i] == cond:
            s[n_] = v1[i]
            n_ += 1
    return s

Profiling tells me, there is some speed to gain here

What I could do is adapting the function, cause sometimes, for instance the mean of this vector is calculated, sometimes the sum. So I could rewrite it, for summing or taking average. But isn't there a way to create memory-view with very little overhead directly, defining size dynamically. Something like first creating a c buffer using malloc etc and at the end of the function convert the buffer to a view, passing the pointer and strides or so..

Edit 1: Maybe for simple cases, adapting the function e. g. like this is an acceptable approach. I only added an argument and summing/taking average. This way I dont have to create an array and can take an easy to handle inside function malloc. This won't get any faster, will it?

# ...
cdef double vec_eq(double [:] v1, int [:] v2, int cond, opt=0):
    # additional option argument
    ''' Function output corresponds to v1[v2 == cond].sum() / .mean()'''
    cdef unsigned int n = v1.shape[0]
    cdef int n_ = 0
    # Size of array to create
    cdef Py_ssize_t i
    for i in prange(n, nogil=True):
        if v2[i] == cond:
            n_ += 1
    # Create array for selection
    cdef double s = 0
    cdef double * v3 = <double *> malloc(sizeof(double) * n_)
    if v3 == NULL:
        abort()
    # Copy selection to new array
    n_ = 0
    for i in range(n):
        if v2[i] == cond:
            v3[n_] = v1[i]
            n_ += 1
    # Do further computation here, according to option
    # Option 0 for the sum
    if opt == 0:
        for i in prange(n_, nogil=True):
            s += v3[i]
        free(v3)
        return s
    # Option 1 for the mean
    else:
        for i in prange(n_, nogil=True):
            s += v3[i]
        free(v3)
        return s / n_
    # Since in the end there is always only a single double value, 
    # the memory can be freed right here

978

asked Jan 08 '14 20:01

embert

1 Answers

Didn't know, how to deal with cpython arrays, so I solved this finally by a self made 'memory view', as proposed by fabrizioM. Wouldn't have thought that this would work. Creating a new np.array in a tight loop is pretty expensive, so this gave me a significant speed up. Since I only need a 1 dimensional array, I didn't even had to bother with strides. But even for a higher dimensional arrays, I think this could go well.

cdef class Vector:
    cdef double *data
    cdef public int n_ax0

    def __init__(Vector self, int n_ax0):
        self.data = <double*> malloc (sizeof(double) * n_ax0)
        self.n_ax0 = n_ax0

    def __dealloc__(Vector self):
        free(self.data)

...
#@cython.profile(False)
@cython.boundscheck(False)
cdef Vector my_vec_func(double [:, ::1] a, int [:] v, int cond, int opt):
    # function returning a Vector, which can be hopefully freed by del Vector
    cdef int vecsize
    cdef size_t i
    # defs..
    # more stuff...
    vecsize = n
    cdef Vector v = Vector(vecsize)

    for i in range(vecsize):
        # computation
        v[i] = ...

    return v

...
vec = my_vec_func(...
ptr_to_data = vec.data
length_of_vec = vec.n_ax0

182

answered Oct 09 '22 18:10

embert

Related questions
                            
                                How to run localtunnel v2 properly
                            
                                Is the Visitor pattern useful for dynamically typed languages?
                            
                                How to make CMake execute some script after it generates visual studio solution
                            
                                Disabling nose coverage report to STDOUT when HTML report is enabled?
                            
                                Using Parent Directories in Manifest.in
                            
                                How to indent attributes in when prettyprinting xml in python?
                            
                                Is there a plugin for sublime text 2 for debugging python?
                            
                                Documenting a non-existing member with Doxygen
                            
                                OSError: [Errno 11] Resource temporarily unavailable. What causes this?
                            
                                How to properly package flask application with static files
                            
                                Issue with querying Teradata in Python/Pyodbc
                            
                                Bad status line exception while opening a page on django development server
                            
                                PyTables dealing with data with size many times larger than size of memory
                            
                                Numpy Matrix Subtraction Confusion
                            
                                Select rows in a Numpy 2D array with a boolean vector
                            
                                Python Flask cross site HTTP POST - doesn't work for specific allowed origins
                            
                                Is PyBluez alive?
                            
                                What happens if you mix old and new style classes in a hierarchy?
                            
                                Looking for recommendation on how to convert PDF into structured format
                            
                                python cgitb is not functioning through a browser

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Cython: Create memoryview without NumPy array?

Tags:

python

arrays

numpy

cython

memoryview

embert

People also ask

1 Answers

embert

Recent Activity

Donate For Us