Random row selection in R

Tags:

r

I have this data frame:

id <- c(1,1,1,2,2,3)
name <- c("A","A","A","B","B","C")
value <- 7:12
df <- data.frame(id = id, name = name, value = value)
df

This function selects n random rows from it:

randomRows = function(df,n){
  return(df[sample(nrow(df),n),])
}

For example, to select a single row:

randomRows(df,1)

But I want to randomly select one row per 'name' (or per 'id', which gives the same grouping here) and combine those rows into a new table, so in this case three rows. This has to work on a data frame with 2000+ rows. How can I do this?

asked Apr 04 '12 11:04 by Bernard


1 Answer

I think you can do this with the plyr package:

library("plyr")
ddply(df,.(name),randomRows,1)

which gives you, for example (the selected rows vary between runs):

  id name value
1  1    A     8
2  2    B    11
3  3    C    12
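
For comparison, the same idea can be sketched in base R without plyr. This is only a sketch under assumptions: the helper name randomRowPerGroup is hypothetical (it is not from the original post), and the grouping column is passed as a string. It splits the row indices by group and draws one index from each group.

# Example data from the question
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3),
  name  = c("A", "A", "A", "B", "B", "C"),
  value = 7:12
)

# Hypothetical helper (not from the original post):
# returns one randomly chosen row for each distinct value of df[[group]]
randomRowPerGroup <- function(df, group) {
  # Row indices of df, split into one integer vector per group
  idx_by_group <- split(seq_len(nrow(df)), df[[group]])
  # Draw one position within each group; sample.int() is used so that a
  # group with a single row is handled the same way as larger groups
  picked <- vapply(
    idx_by_group,
    function(idx) idx[sample.int(length(idx), 1)],
    integer(1)
  )
  df[picked, , drop = FALSE]
}

set.seed(42)  # optional, only to make the draw reproducible
result <- randomRowPerGroup(df, "name")
result

Because only row indices are sampled and the data frame is subset once at the end, this should cope with a few thousand rows without an explicit loop.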

Is this what you are looking for?

answered Oct 04 '22 03:10 by Sacha Epskamp