
Confused about the numpy.c_ documentation and sample code

I have read the documentation for numpy.c_ many times but am still confused. The page linked below says "Translates slice objects to concatenation along the second axis." Could anyone clarify, for the example below, what the slice objects are and what the 2nd axis is? The inputs all look one-dimensional to me, so I don't see where the 2nd axis comes from.

Using Python 2.7 on Windows.

http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.c_.html#numpy.c_

>>> np.c_[np.array([[1,2,3]]), 0, 0, np.array([[4,5,6]])]
array([[1, 2, 3, 0, 0, 4, 5, 6]])
Lin Ma asked Aug 25 '16 04:08



1 Answer

np.c_ is another way of concatenating arrays:

In [701]: np.c_[np.array([[1,2,3]]), 0, 0, np.array([[4,5,6]])]
Out[701]: array([[1, 2, 3, 0, 0, 4, 5, 6]])

In [702]: np.concatenate([np.array([[1,2,3]]), [[0]], [[0]], np.array([[4,5,6]])], 
     axis=1)
Out[702]: array([[1, 2, 3, 0, 0, 4, 5, 6]])

The output shape is (1,8) in both cases; the concatenation was on axis=1, the 2nd axis.

c_ took care of expanding the scalar 0 to np.array([[0]]), the 2d (1,1) array needed for the concatenation.
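
As a rough sketch of what that expansion amounts to, the scalar can be promoted by hand with np.atleast_2d before calling concatenate. The variable names here are only for illustration, and this mimics the result rather than reproducing what np.c_ does internally:

```python
import numpy as np

a = np.array([[1, 2, 3]])   # shape (1, 3)
b = np.array([[4, 5, 6]])   # shape (1, 3)

# Promote the scalar 0 to a 2d array of shape (1, 1) by hand.
zero_2d = np.atleast_2d(0)

manual = np.concatenate([a, zero_2d, zero_2d, b], axis=1)
via_c = np.c_[a, 0, 0, b]

print(zero_2d.shape)                   # shape of the promoted scalar
print(manual.shape)                    # shape after joining along axis=1
print(np.array_equal(manual, via_c))   # both routes give the same array
```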

np.c_ (and np.r_) is actually an instance of a class with a __getitem__ method, which is why it works with the [] syntax rather than being called like a function. The numpy/lib/index_tricks.py source file is instructive reading.
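
A small sketch of what the [] syntax does under the hood: Python hands the comma-separated items to __getitem__ as a tuple, and any a:b notation arrives as a slice object. ShowKey is a hypothetical helper written for this illustration, not part of numpy:

```python
import numpy as np

a = np.array([[1, 2, 3]])
b = np.array([[4, 5, 6]])

# The [] syntax passes a tuple of the comma-separated items to __getitem__.
indexed = np.c_[a, 0, 0, b]
explicit = np.c_.__getitem__((a, 0, 0, b))
print(np.array_equal(indexed, explicit))


class ShowKey:
    """Hypothetical helper: returns whatever key [] receives."""

    def __getitem__(self, key):
        return key


# Slice notation inside [] shows up as a slice object in the key.
print(ShowKey()[1:4, 0])
```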

Note that the row version, np.r_, works with the : slice syntax, producing a 1d (8,) array (same numbers, but in 1d):

In [706]: np.r_[1:4,0,0,4:7]
Out[706]: array([1, 2, 3, 0, 0, 4, 5, 6])
In [708]: np.concatenate((np.arange(1,4),[0],[0],np.arange(4,7)))
Out[708]: array([1, 2, 3, 0, 0, 4, 5, 6])
In [710]: np.hstack((np.arange(1,4),0,0,np.arange(4,7)))
Out[710]: array([1, 2, 3, 0, 0, 4, 5, 6])

np.c_ is a convenience, but not something you are required to understand. I think being able to work with concatenate directly is more useful. It forces you to think explicitly about the dimensions of the inputs.

[[1,2,3]] is actually a list - a list containing one list. np.array([[1,2,3]]) is a 2d array with shape (1,3). np.arange(1,4) produces a (3,) array with the same numbers. np.arange(1,4)[None,:] makes it a (1,3) array.
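
The shapes described above can be checked directly. This is just a quick sketch, with variable names chosen for illustration:

```python
import numpy as np

nested = [[1, 2, 3]]          # a plain Python list containing one list
arr2d = np.array(nested)      # 2d array
arr1d = np.arange(1, 4)       # 1d array with the same numbers
promoted = arr1d[None, :]     # add a leading axis to the 1d array

print(type(nested).__name__)
print(arr2d.shape, arr1d.shape, promoted.shape)
print(np.array_equal(arr2d, promoted))
```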

slice(1,4) is a slice object. np.r_ and np.c_ can turn a slice object into an array - by actually using np.arange.

In [713]: slice(1,4)
Out[713]: slice(1, 4, None)
In [714]: np.r_[slice(1,4)]
Out[714]: array([1, 2, 3])
In [715]: np.c_[slice(1,4)]   # (3,1) array
Out[715]: 
array([[1],
       [2],
       [3]])
In [716]: np.c_[1:4]   # equivalent with the : notation
Out[716]: 
array([[1],
       [2],
       [3]])
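
This column behaviour is where the "2nd axis" comes from for 1d inputs: np.c_ treats each 1d array as a column and joins the columns side by side. A short sketch, with x and y as illustrative names, comparing it with the explicit concatenate call:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# np.c_ turns each 1d input into a column, then joins along axis=1.
cols = np.c_[x, y]

# The same thing spelled out: add a trailing axis, then concatenate.
same = np.concatenate([x[:, None], y[:, None]], axis=1)

print(cols.shape)
print(np.array_equal(cols, same))
```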

And to get back to the original example (which might not be the best):

In [722]: np.c_[[np.r_[1:4]],0,0,[np.r_[4:7]]]
Out[722]: array([[1, 2, 3, 0, 0, 4, 5, 6]])

==========

In [731]: np.c_[np.ones((5,3)),np.random.randn(5,10)].shape
Out[731]: (5, 13)

For np.c_ the 1st dimension of both needs to match.

In the scikit-learn example, n_samples is the 1st dimension of X (its rows), and the randn array also needs to have that many rows.

n_samples, n_features = X.shape
X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]

np.concatenate([X, randn(n_samples...)], axis=1) should work just as well here. A little wordier, but functionally the same.
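
A quick check of that equivalence, as a sketch only: the X below is a small stand-in array and random_state is a seeded np.random.RandomState, both assumed here since the original data is not shown.

```python
import numpy as np

# Assumed stand-ins for the example's data and random state.
random_state = np.random.RandomState(0)
X = np.ones((5, 3))

n_samples, n_features = X.shape
noise = random_state.randn(n_samples, 200 * n_features)

via_c = np.c_[X, noise]
via_cat = np.concatenate([X, noise], axis=1)

print(via_c.shape)                       # rows unchanged, columns added
print(np.array_equal(via_c, via_cat))    # both give the same array
```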

hpaulj answered Nov 15 '22 10:11
