Related to this question.
gender <- c("F", "M", "M", "F", "F", "M", "F", "F")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
mydf <- data.frame(gender, age)
mydf[ sample( which(mydf$gender=='F'), 3 ), ]
Instead of selecting a fixed number of rows (3 in the case above), how can I randomly select 20% of the rows with "F"? That is, of the five rows with "F", how do I randomly sample 20% of them?
The sample_frac() function selects a random fraction of the rows of a data frame or table. It is used much like sample_n(), which takes a number of rows rather than a fraction.
In Python, use the numpy.random.choice() function to pick random row indices, then use them to index the multidimensional array.
You can use the sample_frac() function in the dplyr package.
e.g. If you want to sample 20% of all rows:
mydf %>% sample_frac(.2)
If you want to sample 20% within each gender group:
mydf %>% group_by(gender) %>% sample_frac(.2)
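Since the question asks about the "F" rows only, one option is to filter first and then sample. This is a minimal sketch, assuming dplyr is installed; f_sample is just an illustrative name:

```r
library(dplyr)

gender <- c("F", "M", "M", "F", "F", "M", "F", "F")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
mydf <- data.frame(gender, age)

# Keep only the "F" rows, then draw 20% of them at random.
# With five "F" rows, 20% comes out to a single row.
f_sample <- mydf %>%
  filter(gender == "F") %>%
  sample_frac(0.2)

print(f_sample)
```

The row you get changes from run to run unless you call set.seed() first.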
How about this:
mydf[ sample( which(mydf$gender=='F'), round(0.2*length(which(mydf$gender=='F')))), ]
Where 0.2 is your 20% and length(which(mydf$gender=='F')) is the total number of rows with F.
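The same base R idea can be wrapped in a small function so the condition is not typed twice. This is only a sketch: sample_rows_frac is a hypothetical helper name, not something from base R or dplyr. It indexes into the matching positions instead of calling sample() on them directly, because sample(x, n) treats a single number x as the range 1:x.

```r
gender <- c("F", "M", "M", "F", "F", "M", "F", "F")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
mydf <- data.frame(gender, age)

# Hypothetical helper: draw a fraction of the rows where `keep` is TRUE.
sample_rows_frac <- function(df, keep, frac) {
  idx <- which(keep)                 # row positions that match
  n <- round(frac * length(idx))     # number of rows to draw
  # Sample positions within idx, so a single matching row is handled correctly.
  df[idx[sample.int(length(idx), n)], , drop = FALSE]
}

set.seed(1)  # only to make the draw reproducible
picked <- sample_rows_frac(mydf, mydf$gender == "F", 0.2)
print(picked)
```

Note that round() can give 0 for small groups (for example 20% of two rows), in which case no rows are returned.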