
Select cells randomly from NumPy array - without replacement

I'm writing some modelling routines in NumPy that need to select cells randomly from a NumPy array and do some processing on them. All cells must be selected without replacement (as in, once a cell has been selected it can't be selected again, but all cells must be selected by the end).

I'm transitioning from IDL where I can find a nice way to do this, but I assume that NumPy has a nice way to do this too. What would you suggest?

Update: I should have stated that I'm trying to do this on 2D arrays, and therefore get a set of 2D indices back.

asked Oct 08 '10 13:10 by robintw




1 Answer

All of these answers seemed a little convoluted to me.

I'm assuming that you have a multi-dimensional array from which you want to generate an exhaustive list of indices. You'd like these indices shuffled so you can then access each of the array elements in a random order.

The following code will do this in a simple and straightforward manner:

#!/usr/bin/env python3
import numpy as np

# Define a two-dimensional array
# Use any number of dimensions, and dimensions of any size
d = np.zeros(30).reshape((5, 6))

# Get a list of index tuples for an array of this shape
indices = list(np.ndindex(d.shape))

# Shuffle the indices in-place
np.random.shuffle(indices)

# Access array elements using the indices to do cool stuff
for i in indices:
    d[i] = 5

print(d)

Printing d verifies that every element has been visited: all cells now hold 5.

Note that the array can have any number of dimensions and that the dimensions can be of any size.
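Since the question asks for 2D indices back, here is a rough sketch of a variant that returns one index array per axis instead of a list of tuples. It shuffles the flat positions and converts them with np.unravel_index. The function name shuffled_index_arrays and the seed parameter are my own, for illustration only.

```python
import numpy as np

def shuffled_index_arrays(shape, seed=None):
    """Return a tuple of index arrays (one per axis) that together
    visit every cell of an array of `shape` exactly once, in random order.

    Hypothetical helper name; `seed` is only there for reproducibility.
    """
    rng = np.random.default_rng(seed)
    # Random ordering of the flat positions 0 .. N-1 (no repeats)
    flat = rng.permutation(int(np.prod(shape)))
    # Convert flat positions into per-axis indices
    return np.unravel_index(flat, shape)

d = np.zeros((5, 6))
rows, cols = shuffled_index_arrays(d.shape, seed=0)

# Visit each cell once, in shuffled order
for r, c in zip(rows, cols):
    d[r, c] = 5
```

The same tuple can also be used for fancy indexing in one go, e.g. d[rows, cols], if the processing does not need to happen cell by cell.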

The only downside to this approach is that if d is large, the indices list may become quite sizable, since it holds one tuple per cell. It would therefore be nice to have a generator. Sadly, I can't think offhand of how to build a shuffled iterator.
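One partial workaround, as a sketch only: keep just a shuffled array of flat integer positions and turn each one into an index tuple on demand. This is not a truly lazy shuffle (the permutation array still has one integer per cell), but it avoids building a Python list of tuples. The name iter_shuffled_indices and the seed parameter are made up for this example.

```python
import numpy as np

def iter_shuffled_indices(shape, seed=None):
    """Yield every index tuple of an array of `shape` exactly once,
    in random order.

    Hypothetical helper; it still stores one integer per cell
    (the flat permutation), but no list of tuples.
    """
    rng = np.random.default_rng(seed)
    flat = rng.permutation(int(np.prod(shape)))
    for k in flat:
        # Convert one flat position into a multi-dimensional index
        yield tuple(int(i) for i in np.unravel_index(k, shape))

d = np.zeros((2, 3, 4))
for idx in iter_shuffled_indices(d.shape, seed=1):
    d[idx] = 5
```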

answered Sep 24 '22 05:09 by Richard