I want to know if Python has an equivalent to the sample() function in R.
The sample() function takes a sample of the specified size from the elements of x, either with or without replacement.
The syntax is:
sample(x, size, replace = FALSE, prob = NULL)
(More information here)
I think numpy.random.choice(a, size=None, replace=True, p=None) may well be what you are looking for. The p argument corresponds to the prob argument in the sample() function.
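As a rough sketch of how the two line up, the wrapper below mimics R's argument order on top of NumPy. The name r_sample and its seed parameter are made up for illustration and are not part of NumPy. Two differences are worth keeping in mind: numpy.random.choice defaults to replace=True, whereas R's sample() defaults to replace = FALSE, and NumPy expects p to sum to 1, so the sketch normalizes the weights first.

```python
import numpy as np

def r_sample(x, size=None, replace=False, prob=None, seed=None):
    """Hypothetical helper imitating R's sample(x, size, replace, prob)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    if size is None:
        size = len(x)  # like R, default to drawing as many items as x holds
    p = None
    if prob is not None:
        p = np.asarray(prob, dtype=float)
        p = p / p.sum()  # NumPy requires probabilities that sum to 1
    return rng.choice(x, size=size, replace=replace, p=p)

# Draw 3 distinct values, then 10 weighted values with replacement.
print(r_sample([1, 2, 3, 4, 5], size=3, seed=0))
print(r_sample([1, 2, 3], size=10, replace=True, prob=[1, 1, 5], seed=0))
```

Passing a seed only makes the draws repeatable; leave it as None for fresh randomness on each call.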
In pandas (Python's closest analogue to R) there are the DataFrame.sample and Series.sample methods, which were both introduced in version 0.16.1.
For example:
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})
>>> df
   a  b
0  1  6
1  2  7
2  3  8
3  4  9
4  5  0
Sampling 3 rows without replacement:
>>> df.sample(3)
   a  b
4  5  0
1  2  7
3  4  9
Sampling 4 rows from column 'a' with replacement, using column 'b' as the corresponding weights for the choices:
>>> df['a'].sample(4, replace=True, weights=df['b'])
3    4
0    1
0    1
2    3
These methods are almost identical to the R function, allowing you to sample a particular number of values - or fraction of values - from your DataFrame/Series, with or without replacement. Note that the prob argument in R's sample() corresponds to weights in the pandas methods.
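To illustrate the "fraction of values" option, here is a small sketch using the frac and random_state parameters of DataFrame.sample. The variable names and the seed value 0 are arbitrary choices for the example. pandas normalizes weights that do not sum to 1, so column 'b' can be passed as-is, and a row with weight 0 is never drawn.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})

# Sample 40% of the rows (2 of 5) without replacement.
# random_state fixes the seed so the same rows come back on every run.
subset = df.sample(frac=0.4, random_state=0)

# Weighted draw with replacement; the row with a == 5 has weight 0
# in column 'b', so it cannot be selected.
weighted = df['a'].sample(n=4, replace=True, weights=df['b'], random_state=0)

print(subset)
print(weighted)
```

Dropping random_state gives a different sample on each call, which is closer to how R's sample() behaves without set.seed().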