Is there a Python equivalent to R's sample() function?

I want to know if Python has an equivalent to the sample() function in R.

The sample() function takes a sample of the specified size from the elements of x, either with or without replacement.

The syntax is:

sample(x, size, replace = FALSE, prob = NULL)

(More information here)

Bilal asked Dec 03 '15 22:12


2 Answers

I think numpy.random.choice(a, size=None, replace=True, p=None) may well be what you are looking for.

The p argument corresponds to the prob argument in the sample() function.
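Here is a minimal sketch of how the two calls might line up. The array x, the weights, and the seed are made up for illustration. Two differences are worth keeping in mind: numpy.random.choice defaults to replace=True, whereas R's sample() defaults to replace = FALSE, and p has to sum to 1, so raw weights are normalized first.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so that repeated runs match

x = np.array([10, 20, 30, 40, 50])  # illustrative data

# Roughly R's sample(x, 3): three distinct values, no replacement.
# replace=False must be passed explicitly because NumPy defaults to True.
no_repl = np.random.choice(x, size=3, replace=False)

# Roughly R's sample(x, 4, replace = TRUE, prob = weights).
# NumPy requires p to sum to 1, so the raw weights are normalized here.
weights = np.array([1, 1, 2, 3, 3], dtype=float)  # illustrative weights
p = weights / weights.sum()
with_repl = np.random.choice(x, size=4, replace=True, p=p)

print(no_repl)
print(with_repl)
```

The drawn values depend on the seed and the NumPy version, so no expected output is shown.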

Julian Wittische answered Nov 01 '22 12:11


In pandas (Python's closest analogue to R) there are the DataFrame.sample and Series.sample methods, which were both introduced in version 0.16.1.

For example:

>>> df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})
>>> df
   a  b
0  1  6
1  2  7
2  3  8
3  4  9
4  5  0

Sampling 3 rows without replacement:

>>> df.sample(3)
   a  b
4  5  0
1  2  7
3  4  9

Sampling 4 rows from column 'a' with replacement, using column 'b' as the weights for the choices:

>>> df['a'].sample(4, replace=True, weights=df['b'])
3    4
0    1
0    1
2    3

These methods are almost identical to the R function, allowing you to sample a particular number of values - or fraction of values - from your DataFrame/Series, with or without replacement. Note that the prob argument in R's sample() corresponds to weights in the pandas methods.
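The fraction form can be sketched the same way, reusing the DataFrame from the example above. The random_state values are arbitrary and only there to make the draws repeatable. Unlike numpy.random.choice, the pandas methods normalize weights that do not sum to 1, and a row with weight 0 (index 4 here) is never drawn.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})

# Sample 40% of the rows (2 of 5) without replacement.
subset = df.sample(frac=0.4, random_state=0)

# Weighted sampling with replacement; the weights in column 'b' are
# normalized by pandas, and the row with weight 0 cannot be selected.
picks = df['a'].sample(4, replace=True, weights=df['b'], random_state=0)

print(subset)
print(picks)
```

Which rows come back depends on the seed and the library versions, so the printed result is left out here.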

Alex Riley answered Nov 01 '22 12:11