
Python jagged array operation efficiency

I am new to Python and I am looking for the most efficient way to do operations with a jagged array.

I have a jagged array like this:

A = array([[array([1, 2, 3]), array([4, 5])],[array([6, 7, 8, 9]), array([10])]], dtype=object)

I want to be able to do things like this:

A=A[A>4]
B=A+A

Apparently Python is very efficient at operations like this on numpy arrays, but unfortunately I need to do them on jagged arrays, and I haven't found such an object in Python. Does it exist in Python, or is there a library that allows efficient operations on jagged arrays?

For the example I gave, here are the outputs I'd like:

A = array([[array([]), array([5])],[array([6, 7, 8, 9]), array([10])]], dtype=object)
B = array([[array([]), array([10])],[array([12, 14, 16, 18]), array([20])]], dtype=object)

But maybe, given the way Python works, it simply cannot do efficient operations on jagged arrays the way it does on numpy arrays. I don't know the details.

Asked by JoVe on May 04 '26 at 20:05


1 Answer

Your array is 2x2:

In [298]: A
Out[298]: 
array([[array([1, 2, 3]), array([4, 5])],
       [array([6, 7, 8, 9]), array([10])]], dtype=object)

While A+A works, a boolean test such as A>4 is not supported for this kind of array:

In [299]: A>4
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
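
The error message suggests that the comparison yields a boolean array for each cell, which then cannot be reduced to a single True/False. As a minimal sketch, the masks can instead be built one cell at a time. `make_jagged` is a hypothetical helper introduced here for illustration; it fills the object array cell by cell so the construction does not depend on how `np.array` handles ragged input.

```python
import numpy as np

def make_jagged(rows):
    """Build a 2-D object array whose cells are 1-D integer arrays.

    Hypothetical helper, not part of NumPy.
    """
    out = np.empty((len(rows), len(rows[0])), dtype=object)
    for i, row in enumerate(rows):
        for j, item in enumerate(row):
            out[i, j] = np.asarray(item, dtype=int)
    return out

A = make_jagged([[[1, 2, 3], [4, 5]], [[6, 7, 8, 9], [10]]])

# One boolean mask per cell, rather than one mask for the whole array.
masks = [[cell > 4 for cell in row] for row in A]
```

Each entry of `masks` has the same length as the corresponding cell of `A`, so it can be used to index that cell directly.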

I'm going to flatten A because it makes it easier to compare with list operations:

In [301]: A1=A.flatten()

In [303]: A1+A1
Out[303]: 
array([array([2, 4, 6]), array([ 8, 10]), array([12, 14, 16, 18]),
       array([20])], dtype=object)

In [304]: [a+a for a in A1]
Out[304]: [array([2, 4, 6]), array([ 8, 10]), array([12, 14, 16, 18]), array([20])]

In [305]: timeit A1+A1
100000 loops, best of 3: 6.85 µs per loop

In [306]: timeit [a+a for a in A1]
100000 loops, best of 3: 9.09 µs per loop

The array operation is a bit faster than a list comprehension. But if I first turn the array into a list:

In [307]: A1l=A1.tolist()

In [308]: A1l
Out[308]: [array([1, 2, 3]), array([4, 5]), array([6, 7, 8, 9]), array([10])]

In [309]: timeit [a+a for a in A1l]
100000 loops, best of 3: 5.2 µs per loop

the times improve. This is a good indication that A1+A1 (or even A+A) is using a similar sort of iteration internally.

So the straightforward way of performing your A, B calculation is

In [310]: A2=[a[a>4] for a in A1]
In [311]: B=[a+a for a in A2]
In [312]: B
Out[312]: [array([], dtype=int32), array([10]), array([12, 14, 16, 18]), array([20])]

(we can convert to/from arrays and lists as needed).
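
If the 2x2 layout needs to survive the calculation, one option is a small wrapper that applies a function to every cell and writes the result into a new object array of the same shape. This is only a sketch: `map_jagged` is a hypothetical name, and it is still a Python-level loop, so it should not be expected to run faster than the list comprehensions.

```python
import numpy as np

def map_jagged(func, arr):
    """Apply func to every cell of an object array, keeping its shape.

    Hypothetical helper, not part of NumPy.
    """
    out = np.empty(arr.shape, dtype=object)
    for idx in np.ndindex(arr.shape):
        out[idx] = func(arr[idx])
    return out

# Rebuild the 2x2 jagged array from the question, cell by cell.
A = np.empty((2, 2), dtype=object)
A[0, 0] = np.array([1, 2, 3])
A[0, 1] = np.array([4, 5])
A[1, 0] = np.array([6, 7, 8, 9])
A[1, 1] = np.array([10])

A2 = map_jagged(lambda a: a[a > 4], A)   # filter each cell
B = map_jagged(lambda a: a + a, A2)      # double each filtered cell
```

Both `A2` and `B` keep the shape `(2, 2)`, with an empty array in the cell where nothing exceeded 4.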

A numpy array stores its data in a flat data buffer, and uses the shape and strides attributes to quickly calculate the location of any element, regardless of the dimensions. The fast array operations use compiled code that rapidly steps through the data buffers of the arguments, performing the operations element by element (or in some other combination).

A dtype object array also has a flat data buffer, but its elements are pointers to lists or arrays stored elsewhere. So while it can index individual elements quickly, it still has to make Python calls to access the sub-arrays. Especially when the array is 1d, it is virtually the same as a flat list holding the same pointers.
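
One way to get back to compiled loops, offered here only as a sketch and not as something the timings above cover, is to keep all values in a single flat numeric buffer and record the length of each sub-array separately. The names `values`, `lengths` and `segment_ids` are assumptions made for this illustration.

```python
import numpy as np

# Hypothetical flat representation of the four sub-arrays: one contiguous
# buffer of values plus the length of each segment.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
lengths = np.array([3, 2, 4, 1])

# Element-wise arithmetic runs over the flat buffer; lengths are unchanged.
doubled = values + values

# Filtering changes the segment lengths, so count the survivors per segment.
segment_ids = np.repeat(np.arange(len(lengths)), lengths)
keep = values > 4
kept_values = values[keep]
kept_lengths = np.bincount(segment_ids[keep], minlength=len(lengths))

# Recover the jagged view only when it is actually needed.
pieces = np.split(kept_values, np.cumsum(kept_lengths)[:-1])
```

The arithmetic and the filter each touch one ordinary numeric array, and the jagged structure is rebuilt only at the end with `np.split`.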

Multidimensional object arrays are nicer than nested lists. You can reshape them, access elements (A[1,3] vs. Al[1][3]), transpose them, etc. But when it comes to iterating through all the subarrays, they don't offer much of a benefit.
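
A short sketch of those conveniences on the 2x2 example. It also shows that the transposed, reshaped and list versions all refer to the same sub-array objects rather than to copies of their data.

```python
import numpy as np

# Rebuild the 2x2 jagged array from the question, cell by cell.
A = np.empty((2, 2), dtype=object)
A[0, 0] = np.array([1, 2, 3])
A[0, 1] = np.array([4, 5])
A[1, 0] = np.array([6, 7, 8, 9])
A[1, 1] = np.array([10])

At = A.T              # transpose: rows and columns swapped
flat = A.reshape(4)   # reshape to 1d
Al = A.tolist()       # nested list of the same sub-arrays

# Every variant holds references to the same sub-array objects.
same_after_transpose = At[0, 1] is A[1, 0]
same_after_reshape = flat[3] is A[1, 1]
same_in_list = Al[1][0] is A[1, 0]
```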

Looking again at your 2d array:

In [315]: timeit A+A
100000 loops, best of 3: 6.93 µs per loop  # 6.85 for A1+A1 (above)

In [316]: timeit [[j+j for j in i] for i in A]
100000 loops, best of 3: 17.1 µs per loop

In [317]: Al = A.tolist()

In [318]: timeit [[j+j for j in i] for i in Al]
100000 loops, best of 3: 7.01 µs per loop    # 5.2 for A1l flat list

Basically the same time for summing the array and iterating through the equivalent nested list.
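
To repeat this comparison outside IPython, a sketch using the standard `timeit` module might look like the following. The absolute numbers depend on the machine and the NumPy version, so no expected timings are given; the function names are assumptions made for this example.

```python
import timeit
import numpy as np

# Rebuild the 2x2 jagged array from the question, cell by cell.
A = np.empty((2, 2), dtype=object)
A[0, 0] = np.array([1, 2, 3])
A[0, 1] = np.array([4, 5])
A[1, 0] = np.array([6, 7, 8, 9])
A[1, 1] = np.array([10])
Al = A.tolist()

def add_array():
    # Object-array addition: NumPy iterates over the cells internally.
    return A + A

def add_nested_list():
    # The same work written as an explicit nested list comprehension.
    return [[j + j for j in i] for i in Al]

n = 10_000
t_array = timeit.timeit(add_array, number=n)
t_list = timeit.timeit(add_nested_list, number=n)
print(f"A + A:       {t_array / n * 1e6:.2f} µs per loop")
print(f"nested list: {t_list / n * 1e6:.2f} µs per loop")
```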

Answered by hpaulj on May 06 '26 at 09:05


