
How to understand an empty dimension in a Python NumPy array?

In the Python NumPy package, I am having trouble understanding the case where an ndarray's second dimension appears to be empty. Here is an example:

    In[1]: d2 = np.random.rand(10)
    In[2]: d2.shape = (-1, 1)

    In[3]: print(d2.shape)
    In[4]: print(d2)

    In[5]: print(d2[::2, 0].shape)
    In[6]: print(d2[::2, 0])

    Out[3]: (10, 1)
    Out[4]:
    [[ 0.12362278]
     [ 0.26365227]
     [ 0.33939172]
     [ 0.91501369]
     [ 0.97008342]
     [ 0.95294087]
     [ 0.38906367]
     [ 0.1012371 ]
     [ 0.67842086]
     [ 0.23711077]]

    Out[5]: (5,)
    Out[6]: [ 0.12362278  0.33939172  0.97008342  0.38906367  0.67842086]

My understanding is that d2 is a 10-row by 1-column ndarray. Out[6] looks like a 1-by-5 array, so how can its shape be (5,)? What does the empty second dimension mean?

Kid_Learning_C asked Apr 04 '16 21:04




3 Answers

Let me give you one example that illustrates an important difference.

import numpy as np

d1 = np.array([1, 2, 3, 4, 5])  # array([1, 2, 3, 4, 5])
d1.shape  # (5,): a 1-d array with a single axis
d1.size   # 5
# Note: d1.T is the same as d1.

d2 = d1[np.newaxis]  # array([[1, 2, 3, 4, 5]]). Note the extra []
d2.shape  # (1, 5): a row array
d2.size   # 5
# Note: d2.T gives a column array:
# array([[1],
#        [2],
#        [3],
#        [4],
#        [5]])
d2.T.shape  # (5, 1)
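
The same point can be checked as a runnable script. This is only a minimal sketch; the names `row` and `col` are illustrative and do not come from the answer above. It shows that transposing a 1-d array changes nothing, while `np.newaxis` can place the extra length-1 axis on either side.

```python
import numpy as np

d1 = np.array([1, 2, 3, 4, 5])

# A 1-d array has a single axis, so transposing it is a no-op.
assert d1.ndim == 1
assert d1.T.shape == (5,)

# np.newaxis inserts a length-1 axis at the position where it appears.
row = d1[np.newaxis, :]   # shape (1, 5)
col = d1[:, np.newaxis]   # shape (5, 1)

print(row.shape, col.shape)         # (1, 5) (5, 1)
print(np.array_equal(row.T, col))   # True
```

Writing `d1[:, np.newaxis]` gives the column array directly, without going through a transpose.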
Hun answered Oct 22 '22 17:10


I also used to think that ndarrays would represent even 1-d arrays as 2-d arrays with a thickness of 1, perhaps because the name "ndarray" suggests many dimensions. However, n can be 1, so an ndarray can have just one dimension.

Compare these:

x = np.array([[1], [2], [3], [4]])
x.shape
# (4, 1)
x = np.array([[1, 2, 3, 4]])
x.shape
# (1, 4)
x = np.array([1, 2, 3, 4])
x.shape
# (4,)

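Here (4,) is a one-element tuple: the array has a single dimension of length 4. (In Python, (4) without the trailing comma is just the integer 4, which NumPy also accepts as a 1-d shape.)

If I reshape x and then set its shape back to (4,), I get the original array back:

x.shape = (2, 2)
x
# array([[1, 2],
#        [3, 4]])
x.shape = (4,)
x
# array([1, 2, 3, 4])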
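
As a hedged illustration of the shape notation, the snippet below separates the plain-Python tuple syntax from what NumPy does with it. The names `a` and `b` are made up for this sketch. It also shows that the length of the shape tuple is the number of dimensions.

```python
import numpy as np

# In plain Python, the trailing comma is what makes a tuple.
print(type((4,)).__name__)  # tuple
print(type((4)).__name__)   # int

x = np.array([1, 2, 3, 4])

# The shape tuple has one entry per dimension, so its length equals ndim.
print(x.shape, len(x.shape), x.ndim)  # (4,) 1 1

# reshape accepts either an int or a one-element tuple for a 1-d result.
a = x.reshape(2, 2).reshape(4)
b = x.reshape(2, 2).reshape((4,))
print(a.shape == b.shape == (4,))  # True
```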
Ozgur Ozturk answered Oct 22 '22 17:10


The main thing to understand here is that indexing with an integer is different from indexing with a slice. For example, when you index a 1-d array or a list with an integer, you get a scalar, but when you index with a slice, you get an array or a list, respectively. The same applies to arrays with two or more dimensions: each integer index removes one dimension, while each slice keeps it. For example:

# Make a 3d array:
import numpy as np
array = np.arange(60).reshape((3, 4, 5))

# Indexing with ints gives a scalar
print(array[2, 3, 4] == 59)
# True

# Indexing with slices gives a 3d array
print(array[:2, :2, :2].shape)
# (2, 2, 2)

# Indexing with a mix of slices and ints will give an array with < 3 dims
print(array[0, :2, :3].shape)
# (2, 3)
print(array[:, 2, 0:1].shape)
# (3, 1)

This can be really useful conceptually, because sometimes it's great to think of an array as a collection of vectors. For example, I can represent N points in space as an (N, 3) array:

n_points = np.random.random([10, 3])
point_2 = n_points[2]
print(all(point_2 == n_points[2, :]))
# True
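
Applied to the array from the question, the same rule accounts for the `(5,)` shape. The sketch below uses `np.arange` as a deterministic stand-in for the question's random values (an assumption made here for reproducibility), and the names `int_indexed` and `slice_indexed` are illustrative.

```python
import numpy as np

# Stand-in for the question's d2: 10 rows, 1 column (values are arbitrary).
d2 = np.arange(10.0).reshape(-1, 1)

# The integer index 0 removes the second axis, leaving a 1-d array.
int_indexed = d2[::2, 0]
print(int_indexed.shape)    # (5,)

# The slice 0:1 selects the same column but keeps the axis.
slice_indexed = d2[::2, 0:1]
print(slice_indexed.shape)  # (5, 1)

# Both hold the same values; only the number of axes differs.
print(np.array_equal(int_indexed, slice_indexed[:, 0]))  # True
```

So `(5,)` is not a 1-by-5 array with a missing entry: it is a 1-d array of length 5, because the integer index dropped the column axis.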
Bi Rico answered Oct 22 '22 16:10