In Python's NumPy package, I am having trouble understanding the case where an ndarray's second dimension is empty. Here is an example:
In[1]: d2 = np.random.rand(10)
In[2]: d2.shape = (-1, 1)
In[3]: print(d2.shape)
In[4]: print(d2)
In[5]: print(d2[::2, 0].shape)
In[6]: print(d2[::2, 0])
Out[3]:(10, 1)
Out[4]:
[[ 0.12362278]
[ 0.26365227]
[ 0.33939172]
[ 0.91501369]
[ 0.97008342]
[ 0.95294087]
[ 0.38906367]
[ 0.1012371 ]
[ 0.67842086]
[ 0.23711077]]
Out[5]: (5,)
Out[6]: [ 0.12362278 0.33939172 0.97008342 0.38906367 0.67842086]
My understanding is that d2 is a 10-row by 1-column ndarray. Out[6] is obviously a 1 by 5 array, so how can its shape be (5,)? What does the empty second dimension mean?
Use the ndim attribute of a NumPy array (numpy_array_name.ndim) to get the number of dimensions. Alternatively, use the shape attribute to get the size of each dimension and pass it to len() to get the number of dimensions.
len() is the Python built-in function that returns the number of elements in a list or the number of characters in a string. For a numpy.ndarray, len() returns the size of the first dimension. It is equivalent to shape[0], and it equals size for one-dimensional arrays.
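As a quick check of these attributes, here is a minimal sketch; the array names m and v are only illustrative and do not come from the question.

```python
import numpy as np

m = np.zeros((3, 4))   # 2-d array: 3 rows, 4 columns
v = np.zeros(5)        # 1-d array with 5 elements

print(m.ndim, m.shape, len(m), m.size)   # 2 (3, 4) 3 12
print(v.ndim, v.shape, len(v), v.size)   # 1 (5,) 5 5
print(len(m.shape) == m.ndim)            # True
```

For the 1-d array, shape is the one-element tuple (5,): there is no second dimension at all, rather than a second dimension that is "empty".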
numpy.empty(): the numpy module provides a function called numpy.empty(). It creates an array of the given shape and type without initializing its entries.
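A small sketch of numpy.empty(), assuming only that NumPy is installed. Note that this function is unrelated to the "empty" slot after the comma in a shape such as (5,); it is about uninitialized values, not about missing dimensions. The name buf is illustrative.

```python
import numpy as np

# Allocate a 1-d array of 5 floats; its contents are arbitrary until written.
buf = np.empty((5,), dtype=float)
print(buf.shape, buf.ndim)   # (5,) 1

# Fill the array before reading from it.
buf[:] = 0.0
print(buf.sum())             # 0.0
```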
Let me give you one example that illustrates an important difference.
import numpy as np
d1 = np.array([1, 2, 3, 4, 5])  # array([1, 2, 3, 4, 5])
d1.shape  # (5,): a 1-d array (it prints like a row, but has only one axis)
d1.size   # 5
# Note: d1.T is the same as d1.
d2 = d1[np.newaxis]  # array([[1, 2, 3, 4, 5]]). Note the extra []
d2.shape  # (1, 5)
d2.size   # 5
# Note: d2.T will give a column array:
# array([[1],
#        [2],
#        [3],
#        [4],
#        [5]])
d2.T.shape  # (5, 1)
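Since transposing a 1-d array does nothing, a column has to be built by adding an axis explicitly. The sketch below shows a few equivalent ways, using only np.newaxis and reshape:

```python
import numpy as np

d1 = np.array([1, 2, 3, 4, 5])

print(d1.T.shape)               # (5,)   transposing a 1-d array changes nothing
print(d1[np.newaxis, :].shape)  # (1, 5) new leading axis gives a row
print(d1[:, np.newaxis].shape)  # (5, 1) new trailing axis gives a column
print(d1.reshape(-1, 1).shape)  # (5, 1) the same column via reshape
```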
I also thought ndarrays would represent even 1-d arrays as 2-d arrays with a thickness of 1. Maybe the name "ndarray" makes us think of high dimensions, but n can be 1, so an ndarray can have just one dimension.
Compare these:
x = np.array([[1], [2], [3], [4]])
x.shape
# (4, 1)
x = np.array([[1, 2, 3, 4]])
x.shape
#(1, 4)
x = np.array([1, 2, 3, 4])
x.shape
#(4,)
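Here (4,) is a one-element tuple: the array has a single dimension of length 4. NumPy also accepts the plain integer 4, written (4), as shorthand for that shape.
If I reshape x and then set the shape back to (4), it returns to the original: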
x.shape = (2,2)
x
# array([[1, 2],
# [3, 4]])
x.shape = (4)
x
# array([1, 2, 3, 4])
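The difference between (4,) and (4) is plain Python syntax rather than anything specific to NumPy. A minimal sketch (the names flat_from_int and flat_from_tuple are illustrative):

```python
import numpy as np

print(type((4,)).__name__)   # tuple: the trailing comma makes a one-element tuple
print(type((4)).__name__)    # int: parentheses alone do not make a tuple

x = np.arange(1, 5).reshape(2, 2)
flat_from_int = x.reshape(4)        # an int is accepted as shorthand
flat_from_tuple = x.reshape((4,))   # the explicit one-element tuple
print(flat_from_int.shape, flat_from_tuple.shape)   # (4,) (4,)
```

Either way, the resulting shape is always reported as the tuple (4,).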
The main thing to understand here is that indexing with an integer is different from indexing with a slice. For example, when you index a 1d array or a list with an integer you get a scalar, but when you index with a slice you get an array or a list, respectively. The same applies to arrays with two or more dimensions: each integer index removes one dimension, while each slice keeps it. So for example:
# Make a 3d array:
import numpy as np
array = np.arange(60).reshape((3, 4, 5))
# Indexing with ints on every axis gives a scalar
print(array[2, 3, 4] == 59)
# True
# Indexing with slices gives a 3d array
print(array[:2, :2, :2].shape)
# (2, 2, 2)
# Indexing with a mix of slices and ints will give an array with < 3 dims
print(array[0, :2, :3].shape)
# (2, 3)
print(array[:, 2, 0:1].shape)
# (3, 1)
This can be really useful conceptually, because sometimes it's great to think of an array as a collection of vectors. For example, I can represent N points in space as an (N, 3) array:
n_points = np.random.random([10, 3])
point_2 = n_points[2]
print(all(point_2 == n_points[2, :]))
# True
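Applied to the array from the question, the same rule explains the (5,) result. The sketch below uses np.arange as a stand-in for the random data, which is an assumption made only to keep the example deterministic:

```python
import numpy as np

d2 = np.arange(10.0).reshape(-1, 1)   # shape (10, 1), like d2 in the question

print(d2[::2, 0].shape)     # (5,)   the integer index removes the second axis
print(d2[::2, 0:1].shape)   # (5, 1) a length-1 slice keeps it
print(d2[::2, [0]].shape)   # (5, 1) a list index keeps it too
print(d2[::2, 0][np.newaxis, :].shape)   # (1, 5) explicit row, if that is what you want
```

So the result of d2[::2, 0] is neither 1 by 5 nor 5 by 1; it is a 1-d array with five elements.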