Force numpy to keep a list a list

Tags: python, numpy

x2_Kaxs is an Nx3 numpy array of lists, and the elements in those lists index into another array. I want to end up with an Nx3 numpy array of lists of those indexed elements.

x2_Kcids = array([ ax2_cid[axs] for axs in x2_Kaxs.flat ], dtype=object)

This outputs a flat array of N*3 numpy arrays. Great, that is almost what I want. All I need to do is reshape it.

x2_Kcids.shape = x2_Kaxs.shape

And this works. x2_Kcids becomes an Nx3 array of numpy arrays. Perfect.

Except when all the lists in x2_Kaxs have only one element. Then numpy flattens the result into an Nx3 array of integers, and my code expects a list later in the pipeline.

One solution I came up with was to append a dummy element and then pop it off, but that is very ugly. Is there anything nicer?

Erotemic asked Mar 14 '13 17:03


1 Answer

Your problem is not really about lists of size 1; it is about lists that are all the same size. I have created these dummy samples:

import random
import numpy as np

ax2_cid = np.random.rand(10)
shape = (10, 3)

# Lists of random length (1 to 5)
x2_Kaxs = np.empty(shape, dtype=object).reshape(-1)
for j in range(x2_Kaxs.size):
    x2_Kaxs[j] = [random.randint(0, 9) for k in range(random.randint(1, 5))]
x2_Kaxs.shape = shape

# Every list has exactly one element
x2_Kaxs_1 = np.empty(shape, dtype=object).reshape(-1)
for j in range(x2_Kaxs_1.size):
    x2_Kaxs_1[j] = [random.randint(0, 9)]
x2_Kaxs_1.shape = shape

# Every list has exactly two elements
x2_Kaxs_2 = np.empty(shape, dtype=object).reshape(-1)
for j in range(x2_Kaxs_2.size):
    x2_Kaxs_2[j] = [random.randint(0, 9) for k in range(2)]
x2_Kaxs_2.shape = shape

If we run your code on these three, the return has the following shapes:

>>> np.array([ax2_cid[axs] for axs in x2_Kaxs.flat], dtype=object).shape
(30,)
>>> np.array([ax2_cid[axs] for axs in x2_Kaxs_1.flat], dtype=object).shape
(30, 1)
>>> np.array([ax2_cid[axs] for axs in x2_Kaxs_2.flat], dtype=object).shape
(30, 2)
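The same shapes can be reproduced without random data. The following is a minimal sketch, assuming a small made-up lookup table (`ax2_cid` here is just `0, 10, 20, ...`) and two hand-written index lists, `ragged` and `uniform`, which are illustrative names and not part of the original code:

```python
import numpy as np

# Hypothetical lookup table: the value at index i is 10 * i.
ax2_cid = np.arange(10) * 10

# Index lists of differing lengths versus index lists that all have length 1.
ragged = [[1], [2, 3], [4]]
uniform = [[1], [2], [4]]

ragged_result = np.array([ax2_cid[axs] for axs in ragged], dtype=object)
uniform_result = np.array([ax2_cid[axs] for axs in uniform], dtype=object)

print(ragged_result.shape)   # (3,)   one array object per list
print(uniform_result.shape)  # (3, 1) inner arrays were unpacked into a new axis
```

Only the ragged input keeps one array per slot; the uniform input gains an extra axis even though dtype=object was requested.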

And the case where every list has length 2 won't even let you reshape to (n, 3). The problem is that, even with dtype=object, numpy tries to numpify your input as much as possible, which goes all the way down to the individual elements when all the lists have the same length. I think that your best bet is to preallocate your x2_Kcids array:

x2_Kcids = np.empty_like(x2_Kaxs).reshape(-1)
shape = x2_Kaxs.shape
x2_Kcids[:] = [ax2_cid[axs] for axs in x2_Kaxs.flat]
x2_Kcids.shape = shape

EDIT: Since unutbu's answer is no longer visible, I am going to steal from him. The code above can be written much more nicely and compactly as:

x2_Kcids = np.empty_like(x2_Kaxs)
x2_Kcids.ravel()[:] = [ax2_cid[axs] for axs in x2_Kaxs.flat]

With the above example of single item lists:

>>> x2_Kcids_1 = np.empty_like(x2_Kaxs_1).reshape(-1)
>>> x2_Kcids_1[:] = [ax2_cid[axs] for axs in x2_Kaxs_1.flat]
>>> x2_Kcids_1.shape = shape
>>> x2_Kcids_1
array([[[ 0.37685372], [ 0.95328117], [ 0.63840868]],
       [[ 0.43009678], [ 0.02069558], [ 0.32455781]],
       [[ 0.32455781], [ 0.37685372], [ 0.09777559]],
       [[ 0.09777559], [ 0.37685372], [ 0.32455781]],
       [[ 0.02069558], [ 0.02069558], [ 0.43009678]],
       [[ 0.32455781], [ 0.63840868], [ 0.37685372]],
       [[ 0.63840868], [ 0.43009678], [ 0.25532799]],
       [[ 0.02069558], [ 0.32455781], [ 0.09777559]],
       [[ 0.43009678], [ 0.37685372], [ 0.63840868]],
       [[ 0.02069558], [ 0.17876822], [ 0.17876822]]], dtype=object)
>>> x2_Kcids_1[0, 0]
array([ 0.37685372])
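If slice assignment feels too implicit, the preallocation idea can also be written as an explicit loop that stores one result per cell. This is only a sketch: `index_per_cell` is a hypothetical helper name, and the lookup table and index lists below are made-up sample data rather than the arrays from the question.

```python
import numpy as np

def index_per_cell(lookup, index_lists):
    """Hypothetical helper: store lookup[idxs] in each cell of an object array."""
    out = np.empty(index_lists.shape, dtype=object)
    for pos in np.ndindex(index_lists.shape):
        # A full integer index assigns the array as a single object,
        # so numpy never gets a chance to unpack it.
        out[pos] = lookup[index_lists[pos]]
    return out

# Made-up sample data: a 2x3 object array whose cells are one-element lists.
ax2_cid = np.arange(10) * 10
x2_Kaxs = np.empty((2, 3), dtype=object)
cells = [[0], [1], [2], [3], [4], [5]]
for pos, cell in zip(np.ndindex(x2_Kaxs.shape), cells):
    x2_Kaxs[pos] = cell

x2_Kcids = index_per_cell(ax2_cid, x2_Kaxs)
print(x2_Kcids.shape)  # (2, 3)
print(x2_Kcids[0, 1])  # [10]
```

The loop is slower than a single slice assignment, but each cell is guaranteed to hold its own array regardless of whether the lists share a length.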
Jaime answered Oct 11 '22 23:10