I have this dataframe
id <- c(1,1,1,2,2,3)
name <- c("A","A","A","B","B","C")
value <- c(7:12)
df<- data.frame(id=id, name=name, value=value)
df
This function selects a random row from it:
randomRows = function(df,n){
return(df[sample(nrow(df),n),])
}
i.e.
randomRows(df,1)
But I want to randomly select one row per 'name' (or per 'id' which is the same) and concatenate that entire row into a new table, so in this case, three rows. This has to loop throught a 2000+ rows dataframe. Please show me how?!
Sample_n() function is used to select n random rows from a dataframe in R.
By using bracket notation on R DataFrame (data.name) we can select rows by column value, by index, by name, by condition e.t.c. You can also use the R base function subset() to get the same results. Besides these, R also provides another function dplyr::filter() to get the rows from the DataFrame.
I think you can do this with the plyr
package:
library("plyr")
ddply(df,.(name),randomRows,1)
which gives you for example:
id name value
1 1 A 8
2 2 B 11
3 3 C 12
Is this what you are looking for?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With