numpy vstack vs. column_stack




What exactly is the difference between numpy vstack and column_stack? Reading through the documentation, it looks as if column_stack is an implementation of vstack for 1D arrays. Is it a more efficient implementation? Otherwise, I cannot see why vstack alone would not suffice.

2 Answers

I think the following code illustrates the difference nicely:

    >>> np.vstack(([1,2,3],[4,5,6]))
    array([[1, 2, 3],
           [4, 5, 6]])
    >>> np.column_stack(([1,2,3],[4,5,6]))
    array([[1, 4],
           [2, 5],
           [3, 6]])
    >>> np.hstack(([1,2,3],[4,5,6]))
    array([1, 2, 3, 4, 5, 6])

I've included hstack for comparison as well. Notice how column_stack stacks along the second dimension whereas vstack stacks along the first dimension. The equivalent to column_stack is the following hstack command:

    >>> np.hstack(([[1],[2],[3]],[[4],[5],[6]]))
    array([[1, 4],
           [2, 5],
           [3, 6]])

I hope we can agree that column_stack is more convenient.
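If you want to check that equivalence on existing 1-D arrays rather than hand-written nested lists, something like the following should work. This is only a minimal sketch; the names a, b, via_hstack and via_column_stack are illustrative and not from the numpy docs.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Turn each 1-D array into a (3, 1) column, then join the columns side by side.
via_hstack = np.hstack((a[:, np.newaxis], b[:, np.newaxis]))

# column_stack does the reshaping for us.
via_column_stack = np.column_stack((a, b))

print(np.array_equal(via_hstack, via_column_stack))  # True
print(via_column_stack.shape)                        # (3, 2)
```

The extra `[:, np.newaxis]` on every input is exactly the bookkeeping that column_stack saves you.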

The Notes section of the column_stack documentation points this out:

This function is equivalent to np.vstack(tup).T.

There are many functions in numpy that are convenient wrappers of other functions. For example, the Notes section of vstack says:

Equivalent to np.concatenate(tup, axis=0) if tup contains arrays that are at least 2-dimensional.
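A small sketch of what that note means in practice, assuming two example arrays of my own choosing (m, n, a and b are illustrative names). For inputs that are already 2-D, vstack and concatenate along axis 0 give the same result. For 1-D inputs they differ, because vstack first treats each array as a single row.

```python
import numpy as np

# 2-D inputs: vstack matches concatenate along axis 0.
m = np.array([[1, 2, 3]])
n = np.array([[4, 5, 6]])
same_for_2d = np.array_equal(np.vstack((m, n)), np.concatenate((m, n), axis=0))

# 1-D inputs: concatenate joins end to end, vstack makes one row per input.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
shape_concat = np.concatenate((a, b), axis=0).shape  # (6,)
shape_vstack = np.vstack((a, b)).shape               # (2, 3)

# Promoting the 1-D inputs to 2-D by hand restores the equivalence.
promoted = np.concatenate((np.atleast_2d(a), np.atleast_2d(b)), axis=0)
same_after_promotion = np.array_equal(np.vstack((a, b)), promoted)

print(same_for_2d, shape_concat, shape_vstack, same_after_promotion)
```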

It looks like column_stack is just a convenience function for vstack, at least for 1-D inputs.
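One caveat worth checking yourself: as far as I can tell, the np.vstack(tup).T equivalence only holds for 1-D inputs, because column_stack leaves 2-D arrays as they are and joins them side by side. The sketch below uses made-up arrays (a, b, m, n) to show both cases.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1-D inputs: column_stack and vstack(...).T agree.
same_for_1d = np.array_equal(np.column_stack((a, b)), np.vstack((a, b)).T)

# 2-D inputs: column_stack joins along the second axis, like hstack,
# so the result no longer matches the transposed vstack.
m = np.arange(6).reshape(3, 2)
n = np.arange(6, 12).reshape(3, 2)
shape_column_stack = np.column_stack((m, n)).shape  # (3, 4)
shape_vstack_t = np.vstack((m, n)).T.shape          # (2, 6)
matches_hstack = np.array_equal(np.column_stack((m, n)), np.hstack((m, n)))

print(same_for_1d, shape_column_stack, shape_vstack_t, matches_hstack)
```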

