I have a pandas series features
that has the following values (features.values
)
array([array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
array([0, 0, 0, ..., 0, 0, 0]), ...,
array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
array([0, 0, 0, ..., 0, 0, 0])], dtype=object)
Now I really want this to be recognized as matrix, but if I do
>>> features.values.shape
(10000,)
rather than (10000, 3000)
which is what I would expect.
How can I get this to be recognized as 2d rather than a 1d array with arrays as values. Also why does it not automatically detect it as a 2d array?
Use reshape() Function to Transform 1d Array to 2d Array The number of components within every dimension defines the form of the array. We may add or delete parameters or adjust the number of items within every dimension by using reshaping. To modify the layout of a NumPy ndarray, we will be using the reshape() method.
You can flatten a NumPy array ndarray with the numpy. label() function, or the ravel() and flatten() methods of numpy.
In response your comment question, let's compare 2 ways of creating an array
First make an array from a list of arrays (all same length):
In [302]: arr = np.array([np.arange(3), np.arange(1,4), np.arange(10,13)])
In [303]: arr
Out[303]:
array([[ 0, 1, 2],
[ 1, 2, 3],
[10, 11, 12]])
The result is a 2d array of numbers.
If instead we make an object dtype array, and fill it with arrays:
In [304]: arr = np.empty(3,object)
In [305]: arr[:] = [np.arange(3), np.arange(1,4), np.arange(10,13)]
In [306]: arr
Out[306]:
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
dtype=object)
Notice that this display is like yours. This is, by design a 1d array. Like a list it contains pointers to arrays elsewhere in memory. Notice that it requires an extra construction step. The default behavior of np.array
is to create a multidimensional array where it can.
It takes extra effort to get around that. Likewise it takes some extra effort to undo that - to create the 2d numeric array.
Simply calling np.array
on it does not change the structure.
In [307]: np.array(arr)
Out[307]:
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
dtype=object)
stack
does change it to 2d. stack
treats it as a list of arrays, which it joins on a new axis.
In [308]: np.stack(arr)
Out[308]:
array([[ 0, 1, 2],
[ 1, 2, 3],
[10, 11, 12]])
Shortening @hpauli answer:
your_2d_arry = np.stack(arr_of_arr_object)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With