
Swapping columns in a numpy array?

Tags: python, numpy

    from numpy import *

    def swap_columns(my_array, col1, col2):
        temp = my_array[:,col1]
        my_array[:,col1] = my_array[:,col2]
        my_array[:,col2] = temp

Then

    swap_columns(data, 0, 1)

Doesn't work. However, calling the code directly

    temp = my_array[:,0]
    my_array[:,0] = my_array[:,1]
    my_array[:,1] = temp

Does. Why is this happening and how can I fix it? The error says "IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index", which suggests the arguments aren't ints? I already tried converting the column indices to int, but that didn't solve it.

audacious ainsley asked Feb 01 '11 01:02



2 Answers

There are two issues here. The first is that the data you pass to your function apparently isn't a two-dimensional NumPy array -- at least this is what the error message says.
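As a rough illustration of the first issue, the sketch below shows that indexing a 0-d array with two indices raises an `IndexError` (the exact message varies between NumPy versions), and how one might check dimensionality before swapping. The helper name `require_2d` is hypothetical and not part of NumPy.

```python
import numpy as np

def require_2d(data):
    """Hypothetical helper: return `data` as an ndarray, insisting on 2 dimensions."""
    arr = np.asarray(data)  # no copy is made if `data` is already an ndarray
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D array, got ndim={arr.ndim}, shape={arr.shape}")
    return arr

scalar_like = np.array(5)   # a 0-d array: its shape is ()
try:
    scalar_like[:, 0]       # two indices on a 0-d array
    raised = False
except IndexError:
    raised = True

table = require_2d([[0, 1], [2, 3]])  # passes the check: shape (2, 2)
```

Printing `data.ndim` and `data.shape` right before the call to `swap_columns` is usually enough to confirm whether this is what is happening.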

The second issue is that the code does not do what you expect:

    import numpy

    my_array = numpy.arange(9).reshape(3, 3)
    # array([[0, 1, 2],
    #        [3, 4, 5],
    #        [6, 7, 8]])
    temp = my_array[:, 0]
    my_array[:, 0] = my_array[:, 1]
    my_array[:, 1] = temp
    # array([[1, 1, 2],
    #        [4, 4, 5],
    #        [7, 7, 8]])

The problem is that NumPy basic slicing does not create a copy of the data, but rather a view of the same data, so temp changes as soon as column 0 is overwritten. To make this work, you either have to copy explicitly
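The difference between a view and a copy can be checked directly. This is a minimal sketch using `numpy.shares_memory`; the variable names are chosen only for illustration.

```python
import numpy as np

my_array = np.arange(9).reshape(3, 3)

view = my_array[:, 0]          # basic slicing: a view onto my_array's buffer
copy = my_array[:, 0].copy()   # explicit copy: independent memory

shares_view = np.shares_memory(my_array, view)   # True
shares_copy = np.shares_memory(my_array, copy)   # False

my_array[:, 0] = my_array[:, 1]  # overwrite column 0 with column 1

# The view follows the overwrite, the copy keeps the old column 0.
print(view)  # [1 4 7]
print(copy)  # [0 3 6]
```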

    temp = numpy.copy(my_array[:, 0])
    my_array[:, 0] = my_array[:, 1]
    my_array[:, 1] = temp

or use advanced indexing, which returns a copy rather than a view

    my_array[:,[0, 1]] = my_array[:,[1, 0]]
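Applied to the function from the question, the advanced-indexing approach might look like the sketch below. It assumes `my_array` is a genuine two-dimensional NumPy array and modifies it in place.

```python
import numpy as np

def swap_columns(my_array, col1, col2):
    # Indexing with a list returns a copy of the selected columns,
    # so the right-hand side is not affected by the assignment.
    my_array[:, [col1, col2]] = my_array[:, [col2, col1]]

data = np.arange(9).reshape(3, 3)
swap_columns(data, 0, 1)
print(data)
# [[1 0 2]
#  [4 3 5]
#  [7 6 8]]
```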
Sven Marnach answered Oct 03 '22 23:10



I find the following the fastest:

    my_array[:, 0], my_array[:, 1] = my_array[:, 1], my_array[:, 0].copy()

Time analysis of:

    import numpy as np
    my_array = np.arange(900).reshape(30, 30)

is as follows:

    %timeit my_array[:, 0], my_array[:, 1] = my_array[:, 1], my_array[:, 0].copy()
    The slowest run took 15.05 times longer than the fastest. This could mean that an intermediate result is being cached
    1000000 loops, best of 3: 1.72 µs per loop

The advanced indexing times are:

    %timeit my_array[:,[0, 1]] = my_array[:,[1, 0]]
    The slowest run took 7.38 times longer than the fastest. This could mean that an intermediate result is being cached
    100000 loops, best of 3: 6.9 µs per loop
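Outside IPython, a similar comparison can be sketched with the standard-library `timeit` module. The function names and loop counts below are arbitrary choices, the function-call overhead is included in the measurement, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

my_array = np.arange(900).reshape(30, 30)

def swap_with_copy():
    # Tuple assignment; only the second right-hand operand is copied.
    my_array[:, 0], my_array[:, 1] = my_array[:, 1], my_array[:, 0].copy()

def swap_with_advanced_indexing():
    my_array[:, [0, 1]] = my_array[:, [1, 0]]

n = 10_000
# Take the best of three runs and convert to seconds per call.
t_copy = min(timeit.repeat(swap_with_copy, number=n, repeat=3)) / n
t_adv = min(timeit.repeat(swap_with_advanced_indexing, number=n, repeat=3)) / n

print(f"copy-based swap:   {t_copy * 1e6:.2f} µs per loop")
print(f"advanced indexing: {t_adv * 1e6:.2f} µs per loop")
```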
blaz answered Oct 04 '22 00:10
