nested Python numpy arrays dimension confusion

Tags:

python

arrays

numpy

Suppose I have a numpy array c constructed as follows:

a = np.zeros((2,4))
b = np.zeros((2,8))
c = np.array([a,b])

I would have expected c.shape to be (2,1) or (2,) but instead it is (2,2). Additionally, what I want to do is concatenate a column vector of ones onto a, but by accessing it through c in the following way:

c0 = c[0] # I would have expected this to be 'a'
np.concatenate((np.ones((c0.shape[0], 1)), c0), axis=1)

This of course doesn't work because c[0] does not equal a as I expected, and I get

ValueError: all the input arrays must have same number of dimensions

I need some way to have an array (or list) of pairs, each pair component being a numpy array, and I need to access the first array in the pair in order to concatenate a column vector of ones to it. My application is machine learning and my data will be coming to me in the format described, but I need to modify the data at the start in order to add a bias element to it.

EDIT: I'm using Python 2.7 and Numpy 1.8.2

asked Jul 18 '15 16:07

adamconkey

2 Answers

Generally, nested NumPy arrays of NumPy arrays are not very useful. If you are using NumPy for speed, it is usually best to stick with NumPy arrays that have a homogeneous, basic numeric dtype.

To place two items in a data structure such that c[0] returns the first item, and c[1] the second, a list (or tuple) such as c = [a, b] will do.
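As a minimal sketch of that approach applied to the question: keep the pairs in a plain Python list and prepend the column of ones to the first array of each pair. The helper name `add_bias` and the second sample pair are made up for illustration.

```python
import numpy as np

def add_bias(x):
    """Return a copy of the 2-D array x with a column of ones prepended."""
    ones = np.ones((x.shape[0], 1))
    return np.concatenate((ones, x), axis=1)

a = np.zeros((2, 4))
b = np.zeros((2, 8))

# A plain list keeps a and b as separate arrays, so c[0] really is a.
c = [a, b]
c[0] = add_bias(c[0])
print(c[0].shape)  # (2, 5)

# The same idea for a list of pairs (hypothetical sample data):
pairs = [(np.zeros((2, 4)), np.zeros((2, 8))),
         (np.zeros((3, 4)), np.zeros((3, 8)))]
pairs_with_bias = [(add_bias(first), second) for first, second in pairs]
```

Because the list only holds references, nothing is copied until `add_bias` builds the new array, and the second array of each pair is left untouched.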

By the way, if you are using the statsmodels package, then you can add a constant column with sm.add_constant:

import numpy as np
import statsmodels.api as sm

a = np.random.randint(10, size=(2,4))
print(a)
# [[2 3 9 6]
#  [0 2 1 1]]
print(sm.add_constant(a))
# [[ 1.  2.  3.  9.  6.]
#  [ 1.  0.  2.  1.  1.]]

Note however that if a already contains a constant column, no extra column is added:

In [126]: sm.add_constant(np.zeros((2,4)))
Out[126]: 
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
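If the bias column has to be added even when the data already contains a constant column, a NumPy-only sketch such as the one below may be enough. The helper name `prepend_ones` is made up for illustration. (I believe sm.add_constant also accepts has_constant='add' for this case, but check the documentation for your statsmodels version.)

```python
import numpy as np

def prepend_ones(x):
    """Return a copy of the 2-D array x with a leading column of ones.

    Unlike the add_constant call shown above, this always adds the
    column, even if x already has a constant column.
    """
    return np.insert(x, 0, 1.0, axis=1)

z = np.zeros((2, 4))      # every column is constant
zb = prepend_ones(z)
print(zb.shape)  # (2, 5)
```

np.insert returns a new array and leaves its input unchanged, so the original data stays available.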

answered Oct 23 '22 01:10

unutbu

I believe what you want to use is hstack:

a = np.zeros((2,4))  # 4 column vectors of length 2
b = np.ones((2,1))   # 1 column vector of length 2

c = np.hstack((a, b))
print(c)
# [[ 0.  0.  0.  0.  1.]
#  [ 0.  0.  0.  0.  1.]]

Regarding the problem of combining your a and b into one array: this cannot be done in an obvious way. np.array([a, b]) tries to stack the two arrays along a new leading dimension, which requires them to have the same shape. Yours have shapes (2, 4) and (2, 8), so they do not fit on top of one another.
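If a NumPy container is really needed for arrays of different shapes, one option is to build a 1-D object array explicitly and fill it slot by slot. This is only a sketch: what np.array([a, b]) itself does with such input appears to depend on the NumPy version (the question reports shape (2, 2), and newer releases may raise an error instead), so the code below avoids that call.

```python
import numpy as np

a = np.zeros((2, 4))
b = np.zeros((2, 8))

# Allocate two empty object slots, then store each array in its own slot.
c = np.empty(2, dtype=object)
c[0] = a
c[1] = b

print(c.shape)     # (2,)
print(c[0].shape)  # (2, 4)

# c[0] is the original array, so the bias column can be added as usual.
c[0] = np.hstack((np.ones((c[0].shape[0], 1)), c[0]))
print(c[0].shape)  # (2, 5)
```

An object array like this gives none of NumPy's speed benefits over a plain list, so a list or tuple is usually the simpler choice.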

answered Oct 23 '22 01:10

Dux
