I am attempting to create a simple loop in R: I have a large dataset, and I want to draw multiple smaller samples from it and export each one to Excel.
I thought it would work like this, but it doesn't:
idorg <- c(1,2,3,4,5)
x <- c(14,20,21,16,17)
y <- c(31,21,20,50,13)
dataset <- cbind (idorg,x,y)
for (i in 1:4)
{
attempt[i] <- dataset[sample(1:nrow(dataset), 3, replace=FALSE),]
write.table(attempt[i], "C:/Users/me/Desktop/WWD/Excel/dataset[i].xls", sep='\t')
}
In Stata you would need to preserve and restore your data when doing a loop like this, but is this also necessary in R?
You have the following problems:
attempt[i] cannot be assigned to, because attempt does not exist yet and a single vector element cannot hold a whole matrix. Either make it a list to fill up within the loop (if you want to keep the samples), or use a temporary variable attempt.
Use paste() or sprintf() to include the value of the variable i in the file name; inside a quoted string, [i] is just literal text.
Here is a working version of the code:
idorg <- c(1,2,3,4,5)
x <- c(14,20,21,16,17)
y <- c(31,21,20,50,13)
dataset <- cbind (idorg,x,y)
for (i in 1:4) {
attempt <- dataset[sample(1:nrow(dataset), 3, replace=FALSE),]
write.table(attempt, sprintf( "C:/Users/me/Desktop/WWD/Excel/dataset[%d].xls", i ), sep='\t')
}
Will Excel be able to read such a tab-separated table? I'm not sure; I would make a comma-separated table and save it as .csv.
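A minimal sketch of that CSV variant, using the same example data. The output directory (tempdir()) and the file-name pattern dataset_%d.csv are placeholders chosen for illustration, not paths from the question:

```r
# Rebuild the example data from the question
idorg <- c(1, 2, 3, 4, 5)
x <- c(14, 20, 21, 16, 17)
y <- c(31, 21, 20, 50, 13)
dataset <- cbind(idorg, x, y)

# Assumption: write to a temporary directory; replace with your own folder
out_dir <- tempdir()

for (i in 1:4) {
  # Draw 3 distinct rows at random
  attempt <- dataset[sample(nrow(dataset), 3, replace = FALSE), ]
  # sprintf() substitutes the loop index into the file name
  out_file <- file.path(out_dir, sprintf("dataset_%d.csv", i))
  # row.names = FALSE leaves out the extra column of row numbers
  write.csv(attempt, out_file, row.names = FALSE)
}
```

Excel should open the resulting files directly, since they are plain comma-separated text with a header row.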
Unlike Stata, you don't need to preserve and restore your data for this kind of operation in R.
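To illustrate, here is a small sketch that keeps every sample in a list and then checks that dataset itself is untouched. The names samples and original are made up for this example:

```r
idorg <- c(1, 2, 3, 4, 5)
x <- c(14, 20, 21, 16, 17)
y <- c(31, 21, 20, 50, 13)
dataset <- cbind(idorg, x, y)
original <- dataset  # copy kept only for the comparison below

# Pre-allocate a list with one slot per sample
samples <- vector("list", 4)
for (i in 1:4) {
  # [[i]] stores a whole matrix in slot i of the list
  samples[[i]] <- dataset[sample(nrow(dataset), 3, replace = FALSE), ]
}

# Subsetting returns a new object, so the full dataset is unchanged
identical(dataset, original)  # TRUE
length(samples)               # 4
```

Because each sample is a separate object, nothing has to be restored between iterations.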
I think January's solution solves your problem, but I wanted to share another alternative: using lapply() to get a list of all the samples of the dataset:
set.seed(1) # So you can reproduce these results
temp <- setNames(
  lapply(1:4, function(x) {
    # x is only a counter; each call draws 3 distinct rows
    dataset[sample(1:nrow(dataset), 3, replace = FALSE), ]
  }),
  paste0("attempt.", 1:4)
)
This has created a list() named "temp" that comprises four matrices (dataset was built with cbind(), so each sample is a matrix rather than a data.frame).
temp
# $attempt.1
# idorg x y
# [1,] 2 20 21
# [2,] 5 17 13
# [3,] 4 16 50
#
# $attempt.2
# idorg x y
# [1,] 5 17 13
# [2,] 1 14 31
# [3,] 3 21 20
#
# $attempt.3
# idorg x y
# [1,] 5 17 13
# [2,] 3 21 20
# [3,] 2 20 21
#
# $attempt.4
# idorg x y
# [1,] 1 14 31
# [2,] 5 17 13
# [3,] 4 16 50
Lists are very convenient in R. You can now use lapply() to do other fun things: for example, if you wanted to find out the row sums, you can do lapply(temp, rowSums). Or, if you wanted to output separate CSV files (readable by Excel), you can do something like this:
lapply(names(temp), function(x) write.csv(temp[[x]],
file = paste0(x, ".csv")))
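When run at the console, that call also prints a list of NULLs, because lapply() collects the return value of each write.csv() call. A hedged variation that suppresses this with invisible() and writes into an explicit folder might look as follows; out_dir (here tempdir()) is an assumption for illustration, and the data are rebuilt so the snippet runs on its own:

```r
# Rebuild the example data and the named list of samples
dataset <- cbind(idorg = 1:5,
                 x = c(14, 20, 21, 16, 17),
                 y = c(31, 21, 20, 50, 13))
set.seed(1)
temp <- setNames(
  lapply(1:4, function(i) dataset[sample(nrow(dataset), 3), ]),
  paste0("attempt.", 1:4)
)

# Assumption: a temporary folder; point this at your own directory
out_dir <- tempdir()

# invisible() hides the list of NULLs that lapply() would otherwise print
invisible(lapply(names(temp), function(nm) {
  write.csv(temp[[nm]],
            file = file.path(out_dir, paste0(nm, ".csv")),
            row.names = FALSE)
}))

# List the files that were written
list.files(out_dir, pattern = "^attempt\\.[0-9]+\\.csv$")
```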