Why does NumPy have dimension (n,) instead of only (n,1)? [duplicate]

Tags: python, arrays, numpy

I have been curious about this for some time. I can live with it, but it bites me whenever I am not careful enough, so I decided to post it here. Consider the following example (Numpy version 1.8.2, assuming from numpy import * has been run):

a = array([[0, 1], [2, 3]])
print shape(a[0:0, :]) # (0, 2)
print shape(a[0:1, :]) # (1, 2)
print shape(a[0:2, :]) # (2, 2)
print shape(a[0:100, :]) # (2, 2)

print shape(a[0]) # (2, )
print shape(a[0, :]) # (2, )
print shape(a[:, 0]) # (2, )

I don't know how other people feel, but the result feels inconsistent to me. The last line is a column vector while the second-to-last line is a row vector, so they should have different dimensions; in linear algebra they do! (Line 5 is another surprise, but I will ignore it for now.) Consider a second example:

solution = scipy.sparse.linalg.dsolve.linsolve.spsolve(A, b) # solution of dimension (n, )
analytic = reshape(f(x, y), (n, 1)) # analytic of dimension (n, 1)
error = solution - analytic

Now error has dimension (n, n). Yes, in the second line I should use (n, ) instead of (n, 1), but why? I used to use MATLAB a lot, where a 1-D vector always has two dimensions, linspace returns an array of dimension (1, n), and a shape like (n, ) never exists. But in Numpy (n, 1) and (n, ) coexist, and there are many functions for dimension handling alone: the atleast_*d family, newaxis and different uses of reshape. To me those functions cause more confusion than they resolve. If an array prints like [1,2,3], then intuitively its dimension should be [1,3] instead of [3,], right? If Numpy did not have (n, ), I can only see a gain in clarity, not a loss in functionality.
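The (n, n) result comes from NumPy's broadcasting rules, which can be reproduced without a sparse solver. The following is a minimal sketch, assuming a current NumPy imported as np; the arrays are made-up stand-ins for the spsolve result and the reshaped analytic values, not the asker's actual data.

```python
import numpy as np

n = 4
solution = np.arange(n, dtype=float)   # shape (n,), stand-in for the spsolve result
analytic = np.ones((n, 1))             # shape (n, 1), stand-in for reshape(f(x, y), (n, 1))

# Broadcasting treats (n,) as (1, n), then stretches both operands to (n, n).
error_broadcast = solution - analytic
print(error_broadcast.shape)           # (4, 4)

# Flattening the column first keeps both operands one-dimensional.
error = solution - analytic.ravel()
print(error.shape)                     # (4,)
```

Reshaping solution to (n, 1) instead would work equally well; what matters is that both operands have the same number of dimensions before the subtraction.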

So there must be some design reason behind this. I have searched from time to time without finding a clear answer or report. Could someone help clarify this confusion or point me to some useful references? Your help is much appreciated.


asked Jan 09 '15 17:01

Taozi

1 Answer

numpy's philosophy is not that a[:, 0] is a "column vector" and a[0, :] a "row vector" in the general case. Rather, they are both, quite simply, vectors, i.e. arrays with one and only one dimension. This is actually highly logical and consistent (but yes, it can get annoying for those of us accustomed to Matlab).
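A short sketch of that point, assuming NumPy is imported as np (the variable names are illustrative only): an integer index removes the axis it is applied to, while a length-1 slice or np.newaxis keeps or restores it.

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])

row = a[0, :]   # the integer index removes axis 0
col = a[:, 0]   # the integer index removes axis 1
print(row.shape, col.shape)                          # (2,) (2,)

# A length-1 slice keeps the axis, so the orientation survives as a 2-D array.
row_2d = a[0:1, :]
col_2d = a[:, 0:1]

# np.newaxis adds an axis back to a 1-D result.
col_again = col[:, np.newaxis]
print(row_2d.shape, col_2d.shape, col_again.shape)   # (1, 2) (2, 1) (2, 1)
```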

I say "in the general case" because that is true for numpy's most general data structure, the array, which is intended for all kinds of multi-dimensional dense data storage and manipulation, not just matrix math. Having "rows" and "columns" is a highly specialized context for array operations, though a very common one, which is why numpy also supplies the matrix class. Convert your array to a numpy.matrix (or use the matrix constructor instead of array to begin with) and you will see behaviour closer to what you expect. For more information, see the question "What are the differences between numpy arrays and matrices? Which one should I use?"
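As a rough illustration of the matrix behaviour described above, assuming NumPy is imported as np: slices of a numpy.matrix stay two-dimensional. Note that current NumPy documentation no longer recommends the matrix class for new code, so treat this as a demonstration of the difference rather than a recommendation.

```python
import numpy as np

m = np.matrix([[0, 1], [2, 3]])

# Indexing a matrix never drops below two dimensions.
print(m[0, :].shape)   # (1, 2), a row
print(m[:, 0].shape)   # (2, 1), a column

# For np.matrix, * is matrix multiplication: (1, 2) times (2, 1) gives (1, 1).
product = m[0, :] * m[:, 0]
print(product.shape)   # (1, 1)
```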

For cases where you're dealing with more than 2 dimensions, take a look at the numpy.expand_dims function. Though the syntax is annoyingly redundant and unpythonically verbose, when I'm working on arrays with more than 2 dimensions (so cannot use matrix), I'm forever having to use expand_dims to do this kind of thing:

A -= numpy.expand_dims( A.mean( axis=2 ), 2 )   # subtract mean-across-layers from A

instead of

A -= A.mean( axis=2 )   # naive attempt: the mean has lost axis 2, so for most shapes this raises a broadcasting ValueError
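The two variants can be compared on a small array. This is a sketch under assumed shapes (a 2 x 3 x 4 array chosen only for illustration); it also shows the keepdims argument of mean, which is not mentioned above but avoids the explicit expand_dims call.

```python
import numpy as np

A = np.arange(24, dtype=float).reshape(2, 3, 4)

# expand_dims turns the (2, 3) mean into (2, 3, 1), which broadcasts against A.
centered_a = A - np.expand_dims(A.mean(axis=2), 2)

# keepdims=True leaves the reduced axis in place with length 1.
centered_b = A - A.mean(axis=2, keepdims=True)
print(np.allclose(centered_a, centered_b))   # True

# Without the extra axis, (2, 3, 4) and (2, 3) cannot be broadcast together.
try:
    A - A.mean(axis=2)
except ValueError as exc:
    print("broadcast failed:", exc)
```

If the array happened to have equal extents along all axes, the naive version would broadcast without an error and silently subtract the wrong values, which is another reason to keep the reduced axis explicit.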

But consider Matlab, by contrast. Matlab implicitly asserts that there is no such thing as a one-dimensional object and that the minimum number of dimensions a thing can ever have is 2. Sure, you and I are both highly accustomed to this, but take a moment to realize how arbitrary it is. There is clearly a conceptual difference between a fundamentally one-dimensional object, and a two-dimensional object that just happens to have extent 1 in one of its dimensions: the latter is allowed to grow in its second dimension, whereas the former doesn't even know what the second dimension means—and why should it? Hence a.shape==(N,) and a.shape==(N,1) make perfect sense as separate cases. You might as well ask "why is it not (N, 1, 1)?" or "why is it not (N, 1, 1, 1, 1, 1, 1)?"
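The distinction between a one-dimensional array and a two-dimensional array with extent 1 shows up directly in a few operations. A minimal sketch, assuming NumPy is imported as np and using illustrative names:

```python
import numpy as np

v = np.arange(3)       # shape (3,), genuinely one-dimensional
c = v.reshape(3, 1)    # shape (3, 1), two-dimensional with extent 1

print(v.ndim, c.ndim)             # 1 2

# Transposing reverses the axes; with a single axis there is nothing to reverse.
print(v.T.shape, c.T.shape)       # (3,) (1, 3)

# The dot product of two 1-D arrays is a scalar, of two 2-D arrays a 2-D array.
print(np.dot(v, v))               # 5
print(np.dot(c.T, c).shape)       # (1, 1)

# Any number of trailing length-1 axes can be added on request.
print(v[:, None, None].shape)     # (3, 1, 1)
```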

answered Oct 16 '22 04:10

jez
