Delete rows at select indexes from a numpy array

My dataset has close to 200 rows, but for a minimal working example, let's assume the following array:

import numpy as np

arr = np.array([[1,2,3,4], [5,6,7,8], 
               [9,10,11,12], [13,14,15,16], 
               [17,18,19,20], [21,22,23,24]])

I can take a random sample of half of the rows (3 here) as follows:

indexes = np.random.choice(np.arange(arr.shape[0]), int(arr.shape[0]/2), replace=False)

Using these indexes, I can select my test cases as follows:

testing = arr[indexes]

I want to delete the rows at these indexes so that I can use the remaining rows as my training set.

From the post here, it seems that training = np.delete(arr, indexes) ought to do it, but I get a 1-D array instead.

I also tried the suggestion here using training = arr[indexes.astype(np.bool)], but it did not give a clean separation: the row [5,6,7,8] ends up in both the training and testing sets.

training = arr[indexes.astype(np.bool)]

testing
Out[101]: 
array([[13, 14, 15, 16],
       [ 5,  6,  7,  8],
       [17, 18, 19, 20]])

training
Out[102]: 
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Any idea what I am doing wrong? Thanks.

asked May 20 '15 05:05 by sedeh


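1 Answer

np.delete works on the flattened array when no axis is given, which is why you get a 1-D result. To delete the rows at the given indexes, pass axis=0: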

arr = np.delete(arr, indexes, axis=0)
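
For reference, here is a minimal sketch of the full split, assuming the 6x4 array from the question. The seeded generator and the names mask, training_mask and flattened are illustrative choices, not part of the original post. It also shows a boolean-mask variant that should give the same training rows, and the flattened result you get when axis is omitted.

```python
import numpy as np

# Same 6x4 array as in the question: rows [1..4], [5..8], ..., [21..24]
arr = np.arange(1, 25).reshape(6, 4)

# Seed chosen arbitrarily so the example is reproducible (an assumption)
rng = np.random.default_rng(0)
indexes = rng.choice(arr.shape[0], size=arr.shape[0] // 2, replace=False)

# Rows at the sampled indexes form the test set
testing = arr[indexes]

# axis=0 removes whole rows, so the result stays 2-D
training = np.delete(arr, indexes, axis=0)

# Alternative: build a boolean mask with one entry per row
mask = np.ones(arr.shape[0], dtype=bool)
mask[indexes] = False
training_mask = arr[mask]

# Without axis, np.delete operates on the flattened array and returns 1-D
flattened = np.delete(arr, indexes)

print(testing.shape, training.shape, flattened.shape)
```

Note that the mask must have one boolean per row. Casting the integer indexes with astype(bool) only yields True wherever an index is nonzero, which is not a mask of the rows to keep.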
answered Sep 21 '22 02:09 by farhawa