Convert numpy array of arrays to 2d array

Tags:

numpy

I have a pandas series features that has the following values (features.values)

array([array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
       array([0, 0, 0, ..., 0, 0, 0]), ...,
       array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
       array([0, 0, 0, ..., 0, 0, 0])], dtype=object)

I want this to be recognized as a matrix, but when I check the shape I get

>>> features.values.shape
(10000,)

rather than (10000, 3000), which is what I would expect.

How can I get this to be recognized as a 2d array rather than a 1d array with arrays as values? Also, why is it not automatically detected as a 2d array?

asked Jun 21 '18 14:06

2 Answers

In response to your comment question, let's compare two ways of creating an array.

First make an array from a list of arrays (all same length):

In [302]: arr = np.array([np.arange(3), np.arange(1,4), np.arange(10,13)])
In [303]: arr
Out[303]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])

The result is a 2d array of numbers.

If instead we make an object dtype array, and fill it with arrays:

In [304]: arr = np.empty(3,object)
In [305]: arr[:] = [np.arange(3), np.arange(1,4), np.arange(10,13)]
In [306]: arr
Out[306]: 
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)

Notice that this display is like yours. This is, by design, a 1d array. Like a list, it contains pointers to arrays stored elsewhere in memory. Notice also that it requires an extra construction step: the default behavior of np.array is to create a multidimensional array where it can.

It takes extra effort to get around that. Likewise it takes some extra effort to undo that - to create the 2d numeric array.
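As a minimal sketch of the difference described above, the two constructions can be compared by shape and dtype. The names rows, numeric and boxed are illustrative only, and the object array is filled element by element here so that the assignment is unambiguous.

```python
import numpy as np

rows = [np.arange(3), np.arange(1, 4), np.arange(10, 13)]

# np.array builds a 2d numeric array because all rows have the same length.
numeric = np.array(rows)

# An object array is allocated first, then each slot is pointed at an array.
boxed = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    boxed[i] = row

print(numeric.shape, numeric.dtype)    # 2d, integer dtype
print(boxed.shape, boxed.dtype)        # 1d, object dtype
print(type(boxed[0]), boxed[0].shape)  # each element is itself an ndarray
```

The object array reports only one dimension because numpy sees its elements as opaque Python objects, not as rows of numbers.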

Simply calling np.array on it does not change the structure.

In [307]: np.array(arr)
Out[307]: 
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)

np.stack does change it to 2d. It treats the object array as a list of arrays, which it joins along a new axis.

In [308]: np.stack(arr)
Out[308]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])
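Applied to the situation in the question, the same call might look like the sketch below. The array values is a small stand-in for features.values (5 rows of 4 zeros instead of 10000 rows of 3000), and the ragged case is an added assumption to show what happens when the inner arrays do not all have the same length.

```python
import numpy as np

n_rows, n_cols = 5, 4  # small stand-ins for 10000 and 3000

# Stand-in for features.values: a 1d object array whose elements are arrays.
values = np.empty(n_rows, dtype=object)
for i in range(n_rows):
    values[i] = np.zeros(n_cols, dtype=int)

# With a real Series this would be np.stack(features.values).
matrix = np.stack(values)
print(values.shape, values.dtype)
print(matrix.shape, matrix.dtype)

# np.stack needs every inner array to have the same shape.
ragged = np.empty(2, dtype=object)
ragged[0] = np.zeros(3, dtype=int)
ragged[1] = np.zeros(4, dtype=int)
try:
    np.stack(ragged)
    ragged_ok = True
except ValueError as err:
    ragged_ok = False
    print("cannot stack:", err)
```

If the stacking fails on real data, checking the set of lengths of the inner arrays is a reasonable first step.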

answered Oct 07 '22 11:10

hpaulj

Shortening @hpaulj's answer:

your_2d_array = np.stack(arr_of_arr_object)
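For comparison, here is a hedged sketch of two other calls that should give the same result when every inner array is 1d and of equal length. These alternatives are not part of the answers above; the example data is made up for illustration.

```python
import numpy as np

# Illustrative object array holding three equal-length 1d arrays.
arr_of_arr_object = np.empty(3, dtype=object)
for i, row in enumerate([np.arange(3), np.arange(1, 4), np.arange(10, 13)]):
    arr_of_arr_object[i] = row

stacked = np.stack(arr_of_arr_object)             # join along a new first axis
vstacked = np.vstack(arr_of_arr_object)           # stack the 1d arrays as rows
from_list = np.array(arr_of_arr_object.tolist())  # rebuild from a list of arrays

print(stacked.shape, vstacked.shape, from_list.shape)
print(np.array_equal(stacked, vstacked), np.array_equal(stacked, from_list))
```

np.stack remains the most direct of the three, since it states the intent (a new axis) explicitly.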

answered Oct 07 '22 12:10

Shaida Muhammad

Tags:

python

multidimensional-array

pandas

numpy

Nate Stemen
