How can I select all of the rows for a random sample of column values?
I have a dataframe that looks like this:
tag weight
R007 10
R007 11
R007 9
J102 11
J102 9
J102 13
J102 10
M942 3
M054 9
M054 12
V671 12
V671 13
V671 9
V671 12
Z990 10
Z990 11
That you can replicate using...
weights_df <- structure(list(tag = structure(c(4L, 4L, 4L, 1L, 1L, 1L, 1L,
3L, 2L, 2L, 5L, 5L, 5L, 5L, 6L, 6L), .Label = c("J102", "M054",
"M942", "R007", "V671", "Z990"), class = "factor"), value = c(10L,
11L, 9L, 11L, 9L, 13L, 10L, 3L, 9L, 12L, 12L, 14L, 5L, 12L, 11L,
15L)), .Names = c("tag", "value"), class = "data.frame", row.names = c(NA,
-16L))
I need to create a dataframe containing all of the rows from the above dataframe for two randomly sampled tags. Let's say tags R007and M942 get selected at random, my new dataframe needs to look like this:
tag weight
R007 10
R007 11
R007 9
M942 3
How do I do this?
I know I can create a list of two random tags like this:
library(plyr)
tags <- ddply(weights_df, .(tag), summarise, count = length(tag))
set.seed(5464)
tag_sample <- tags[sample(nrow(tags),2),]
tag_sample
Resulting in...
tag count
4 R007 3
3 M942 1
But I just don't know how to use that to subset my original dataframe.
is this what you want?
subset(weights_df, tag%in%sample(levels(tag),2))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With