I have a dataset with 9558 rows from three different projects. I want to randomly split this dataset into three equal groups and assign a unique ID to each group, so that Project1_Project2_Project3 becomes Project1, Project2 and Project3.
I have tried many things and searched for code from people with similar problems. I have used sample_n() and sample_frac(), but unfortunately I can't solve this issue myself :/
I have made an example dataset that looks like mine:
ProjectName <- c("Project1_Project2_Project3")
data <- data.frame(replicate(10, sample(0:1, 9558, replace=TRUE)))
data <- data.frame(ProjectName, data)
The output should be the same data randomly split into three equal groups of nrow=3186 each, with the project names assigned like this:
ProjectName   Count of rows
Project1      3186
Project2      3186
Project3      3186
IMO it should be sufficient to just assign the project names at random: build a balanced vector of the three labels, one per row, and shuffle it with sample().
dat$ProjectName <- sample(factor(rep(1:3, length.out=nrow(dat)),
labels=paste0("Project", 1:3)))
Result
head(dat)
# X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 ProjectName
# 1 1 1 0 1 1 1 1 0 1 0 Project1
# 2 1 1 1 1 1 1 0 0 1 0 Project1
# 3 0 0 1 1 0 0 0 1 1 1 Project1
# 4 1 1 1 0 1 0 1 1 0 1 Project3
# 5 1 0 0 1 1 1 1 0 0 1 Project1
# 6 1 0 0 0 0 1 0 1 1 1 Project3
table(dat$ProjectName)
# Project1 Project2 Project3
# 3186 3186 3186
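If the three groups are needed as separate data frames rather than as one labelled column, the same idea can be spelled out step by step and followed by split(). This is only a sketch in base R: the names n_groups, labels and parts are made up for illustration, and the seed is an arbitrary choice for reproducibility.

```r
# Sketch: balanced random assignment, step by step (base R only).
set.seed(42)  # arbitrary seed, assumed here only for reproducibility
dat <- data.frame(replicate(10, sample(0:1, 9558, replace = TRUE)))

n_groups <- 3  # hypothetical name for the number of groups

# Balanced label vector 1, 2, 3, 1, 2, 3, ... with one entry per row
labels <- rep(seq_len(n_groups), length.out = nrow(dat))

# Shuffling the labels makes the group membership random
# while keeping the group sizes fixed
dat$ProjectName <- factor(sample(labels),
                          labels = paste0("Project", seq_len(n_groups)))

# One data frame per project, collected in a named list
parts <- split(dat, dat$ProjectName)
sapply(parts, nrow)
```

Each element of parts (parts$Project1 and so on) should then hold 3186 rows, since 9558 is divisible by 3.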
Data
set.seed(42)
dat <- data.frame(replicate(10, sample(0:1, 9558, replace=TRUE)))
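A side note on row counts that are not a multiple of three: rep(..., length.out=) simply stops recycling when it reaches the requested length, so the groups then differ in size by at most one row. A small sketch, where assign_groups is a hypothetical helper name and 9560 is just an example row count:

```r
# Sketch: balanced random group labels for any number of rows.
# assign_groups() is a hypothetical helper, not part of any package.
assign_groups <- function(n, k) {
  # Recycle 1..k up to length n, then shuffle the result
  sample(rep(seq_len(k), length.out = n))
}

set.seed(1)  # arbitrary seed, only for reproducibility
g <- assign_groups(9560, 3)  # 9560 is not divisible by 3

# Group sizes: the first two groups get one extra row each
tabulate(g)
```

With 9560 rows this should give group sizes of 3187, 3187 and 3186.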