I have a large 2D array that I would like to declare once, and change occasionnaly only some values depending on a parameter, without traversing the whole array.
To build this array, I have subclassed the numpy ndarray class with dtype=object
and assign to the elements I want to change a function e.g. :
def f(parameter):
return parameter**2
for i in range(np.shape(A)[0]):
A[i,i]=f
for j in range(np.shape(A)[0]):
A[i,j]=1.
I have then overridden the __getitem__
method so that it returns the evaluation of the function with given parameter if it is callable, otherwise return the value itself.
def __getitem__(self, key):
value = super(numpy.ndarray, self).__getitem__(key)
if callable(value):
return value(*self.args)
else:
return value
where self.args
were previously given to the instance of myclass.
However, I need to work with float arrays at the end, and I can't simply convert this array into a dtype=float
array with this technique. I also tried to use numpy views, which does not work either for dtype=object
.
Do you have any better alternative ? Should I override the view method rather than getitem ?
Edit I will maybe have to use Cython in the future, so if you have a solution involving e.g. C pointers, I am interested.
WeldNumpy is a Weld-enabled library that provides a subclass of NumPy's ndarray module, called weldarray, which supports automatic parallelization, lazy evaluation, and various other optimizations for data science workloads.
numpy. array is just a convenience function to create an ndarray ; it is not a class itself. You can also create an array using numpy. ndarray , but it is not the recommended way.
The main data structure in NumPy is the ndarray, which is a shorthand name for N-dimensional array. When working with NumPy, data in an ndarray is simply referred to as an array. It is a fixed-sized array in memory that contains data of the same type, such as integers or floating point values.
ndarray. This array attribute returns the number of array dimensions.
In this case, it does not make sens to bind a transformation function, to every index of your array.
Instead, a more efficient approach would be to define a transformation, as a function, together with a subset of the array it applies to. Here is a basic implementation,
import numpy as np
class LazyEvaluation(object):
def __init__(self):
self.transforms = []
def add_transform(self, function, selection=slice(None), args={}):
self.transforms.append( (function, selection, args))
def __call__(self, x):
y = x.copy()
for function, selection, args in self.transforms:
y[selection] = function(y[selection], **args)
return y
that can be used as follows:
x = np.ones((6, 6))*2
le = LazyEvaluation()
le.add_transform(lambda x: 0, [[3], [0]]) # equivalent to x[3,0]
le.add_transform(lambda x: x**2, (slice(4), slice(4,6))) # equivalent to x[4,4:6]
le.add_transform(lambda x: -1, np.diag_indices(x.shape[0], x.ndim), ) # setting the diagonal
result = le(x)
print(result)
which prints,
array([[-1., 2., 2., 2., 4., 4.],
[ 2., -1., 2., 2., 4., 4.],
[ 2., 2., -1., 2., 4., 4.],
[ 0., 2., 2., -1., 4., 4.],
[ 2., 2., 2., 2., -1., 2.],
[ 2., 2., 2., 2., 2., -1.]])
This way you can easily support all advanced Numpy indexing (element by element access, slicing, fancy indexing etc.), while at the same time keeping your data in an array with a native data type (float
, int
, etc) which is much more efficient than using dtype='object'
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With