
Shuffle one column based on factors in another (R)

Tags:

r

I created a dataframe to illustrate my problem. I am relatively new to R.

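    #### permutation problem

    a <- c("beagle", "beagle", "beagle", "basset", "basset")
    b <- c(44, 33, 22, 34, 42)
    c <- 1:5
    d <- 7:11

    # data.frame() keeps b, c and d numeric;
    # wrapping the vectors in cbind() first would coerce every column to character
    dogframe <- data.frame(a, b, c, d)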


Output:

    > dogframe
           a  b c  d
    1 beagle 44 1  7
    2 beagle 33 2  8
    3 beagle 22 3  9
    4 basset 34 4 10
    5 basset 42 5 11

What I want to do is randomly shuffle column b within each level of column a. So the values 44, 33 and 22 will be shuffled among the "beagle" rows, and 34 and 42 will be shuffled among the "basset" rows. The result should be a dataframe identical to the original except for the shuffled values in column b.

Thanks.

user2795569 asked Sep 19 '13 13:09


2 Answers

Like this:

dogframe$b <- ave(dogframe$b, dogframe$a, FUN = sample)

which you can also write:

dogframe$b <- with(dogframe, ave(b, a, FUN = sample))
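
One caveat, sketched here under the assumption that b is numeric: `sample(x)` treats a single number `x >= 1` as `1:x`, so a group with only one row would not be returned unchanged. The `shuffle` helper below is a hypothetical name introduced for illustration, not part of base R. It permutes by index instead, and the final check confirms that each group still holds the same values.

```r
# Rebuild the example with data.frame() directly so that b stays numeric
dogframe <- data.frame(
  a = c("beagle", "beagle", "beagle", "basset", "basset"),
  b = c(44, 33, 22, 34, 42),
  c = 1:5,
  d = 7:11
)

# Hypothetical helper: permute x by index, which also works for length-1 groups
# (sample(x) on a single number n >= 1 would draw from 1:n instead)
shuffle <- function(x) x[sample.int(length(x))]

set.seed(1)  # only to make the shuffle reproducible
shuffled <- dogframe
shuffled$b <- ave(dogframe$b, dogframe$a, FUN = shuffle)

# Each group should contain the same values as before, in some order
same_values <- mapply(
  function(before, after) identical(sort(before), sort(after)),
  split(dogframe$b, dogframe$a),
  split(shuffled$b, shuffled$a)
)
print(same_values)
```

With the data in the question every group has at least two rows, so plain `sample` behaves the same way there. The helper only matters once a group can shrink to a single row.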

flodel answered Sep 19 '22 23:09


Ok, you have base and plyr solutions already. Here's the third alternative in questions like this:

library(data.table)
DT <- data.table(dogframe)

DT[, b := sample(b), by = a]

This overwrites the b column; if you wanted it in a separate copy, you'd do:

DT2 <- copy(DT)[,b:=sample(b),by=a]
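
To see the difference between the two forms, here is a small sketch that assumes the data.table package is installed and that b is numeric. It shuffles on a copy and then checks that the original table is untouched while each group in the copy keeps its values.

```r
library(data.table)

# The question's example, reduced to the two relevant columns
dogframe <- data.frame(
  a = c("beagle", "beagle", "beagle", "basset", "basset"),
  b = c(44, 33, 22, 34, 42)
)
DT <- as.data.table(dogframe)

set.seed(1)  # only to make the shuffle reproducible

# := modifies by reference, so copy() first to leave DT as it was
DT2 <- copy(DT)[, b := sample(b), by = a]

# DT should still hold the original order of b
print(identical(DT$b, c(44, 33, 22, 34, 42)))

# Within each group, DT2 should hold the same values as DT
print(DT2[, .(values = paste(sort(b), collapse = ",")), by = a])
```

Dropping the `copy()` call would apply the shuffle to `DT` itself, which is the overwriting behaviour described above.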

Frank answered Sep 20 '22 23:09