Randomly sample data frame into 3 groups in R

Objective: Randomly divide a data frame into 3 samples.

- one sample with 60% of the rows
- the other two samples with 20% of the rows each
- samples should not share rows with each other (i.e. sample without replacement)

Here's a clunky solution:

allrows <- seq_len(nrow(mtcars))

set.seed(7)
# 60% of the rows for training (size rounded down to a whole number)
trainrows <- sample(allrows, replace = FALSE, size = floor(0.6*length(allrows)))
# the remaining rows are shared between test and cross-validation
test_cvrows <- allrows[-trainrows]
testrows <- sample(test_cvrows, replace = FALSE, size = floor(0.5*length(test_cvrows)))
cvrows <- setdiff(test_cvrows, testrows)

train <- mtcars[trainrows,]
test <- mtcars[testrows,]
cvr <- mtcars[cvrows,]

There must be something easier, perhaps in a package. dplyr has the sample_frac function, but that seems to target a single sample, not a split into multiple.

A related question comes close but does not quite answer this one: "Random Sample with multiple probabilities in R".

How do you randomly select samples in R?

sample_n() from the dplyr package selects n random rows from a data frame. In dplyr 1.0.0 and later it is superseded by slice_sample().

How do you split data into a group in R?

split() is a built-in R function that divides a vector or data frame into groups defined by a factor (or a list of factors). It takes the data as x and the grouping as f, and returns a list with one element per group. The syntax is: split(x, f, drop = FALSE, ...)
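As a small illustration of that signature, the sketch below splits the built-in mtcars data by its cyl column. The variable name by_cyl is only for this example.

```r
# Split the built-in mtcars data frame by number of cylinders.
# `by_cyl` is an illustrative name, not part of the original question.
by_cyl <- split(mtcars, mtcars$cyl)

names(by_cyl)          # one list element per distinct value of cyl
sapply(by_cyl, nrow)   # number of rows in each group
```

Every row ends up in exactly one list element, which is why split() suits the partitioning problem asked about here.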

How does sample work in R?

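sample() draws random elements from a vector, with or without replacement. Its signature is sample(x, size, replace = FALSE, prob = NULL), where x is the vector to sample from (or a single positive integer n, meaning 1:n), size is the number of elements to draw, replace controls whether an element can be drawn more than once, and prob optionally gives sampling weights. To sample rows of a data frame, sample its row indices and subset with them.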
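A minimal sketch of the common cases, using only base R. The seed and the variable names are arbitrary choices for illustration.

```r
set.seed(7)  # any seed; fixed only for reproducibility

# Draw 5 distinct values from 1:10 (without replacement, the default).
no_repl <- sample(1:10, size = 5)

# Draw 5 values with replacement; repeats are possible.
with_repl <- sample(1:10, size = 5, replace = TRUE)

# To sample rows of a data frame, sample row indices and subset.
# (Passing the data frame itself to sample() would sample its columns.)
row_idx <- sample(nrow(mtcars), size = 5)
five_rows <- mtcars[row_idx, ]
```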

Do you need the partitioning to be exact? If not,

set.seed(7)
ss <- sample(1:3,size=nrow(mtcars),replace=TRUE,prob=c(0.6,0.2,0.2))
train <- mtcars[ss==1,]
test <- mtcars[ss==2,]
cvr <- mtcars[ss==3,]

should do it.

Or, as @Frank says in comments, you can split() the original data to keep them as elements of a list:

mycars <- setNames(split(mtcars,ss), c("train","test","cvr"))
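One caveat with random assignment: on a small data set a group could in principle come out empty, and then split(mtcars, ss) would return fewer than three elements, so the three names would no longer line up. A possible variant, sketched here, attaches the names to the factor levels instead. The name grp is just for illustration.

```r
set.seed(7)
ss <- sample(1:3, size = nrow(mtcars), replace = TRUE, prob = c(0.6, 0.2, 0.2))

# Attach the names to the factor levels instead of relying on setNames().
grp <- factor(ss, levels = 1:3, labels = c("train", "test", "cvr"))
mycars <- split(mtcars, grp)

sapply(mycars, nrow)  # group sizes vary with the seed; they are only roughly 60/20/20
```

Because the levels are declared up front, the result always has the elements train, test and cvr, even if one of them has zero rows.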

Not the prettiest solution (especially for larger samples), but it works.

n = nrow(mtcars)
# use different rounding for different sizes/proportions
times = rep(1:3, floor(c(0.6*n, 0.2*n, 0.2*n)))
ntimes = length(times)
# if rounding down left some rows without a label, assign those at random
if (ntimes < n)
    times = c(times, sample(1:3, n-ntimes, prob=c(0.6,0.2,0.2), replace=FALSE))
# shuffle the group labels across the rows
sets = sample(times)
df1 = mtcars[sets==1,]
df2 = mtcars[sets==2,]
df3 = mtcars[sets==3,]
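The same idea (build exact label counts, then shuffle them) could be wrapped in a small function. This is only a sketch: split_rows is a hypothetical helper, not something from base R or a package, and handing leftover rows to the first groups is one arbitrary rounding choice among several.

```r
# Hypothetical helper (not from any package): split a data frame into
# named groups whose sizes match the given proportions as closely as possible.
split_rows <- function(df, props = c(train = 0.6, test = 0.2, cvr = 0.2)) {
  stopifnot(abs(sum(props) - 1) < 1e-8)
  n <- nrow(df)
  sizes <- floor(props * n)
  # Hand any leftover rows (from rounding down) to the first groups.
  leftover <- n - sum(sizes)
  if (leftover > 0) {
    sizes[seq_len(leftover)] <- sizes[seq_len(leftover)] + 1
  }
  # One label per row, shuffled, so each row lands in exactly one group.
  labels <- rep(names(props), times = sizes)
  grp <- factor(sample(labels), levels = names(props))
  split(df, grp)
}

set.seed(7)
parts <- split_rows(mtcars)
sapply(parts, nrow)
```

For the 32 rows of mtcars this gives groups of 20, 6 and 6 rows, and the result is a named list, so parts$train, parts$test and parts$cvr can be used directly.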

Tags: random, r, random-sample

Minnow

2 Answers

Ben Bolker

Max Candocia
