
R function to return multiple data frames

Tags: r

I have the following function to return 9 data frames:

split_data <- function(dataset, train_perc = 0.6, cv_perc = 0.2, test_perc = 0.2)

{

m <- nrow(dataset)
n <- ncol(dataset)

#Sort the data randomly
data_perm <- dataset[sample(m),]

#Split data into training, CV, and test sets
train <- data_perm[1:round(train_perc*m),]
cv <- data_perm[(round(train_perc*m)+1):round((train_perc+cv_perc)*m),]
test <- data_perm[(round((train_perc+cv_perc)*m)+1):round((train_perc+cv_perc+test_perc)*m),]

#Split sets into X and Y
X_train <- train[c(1:(n-1))]
Y_train <- train[c(n)]

X_cv    <- cv[c(1:(n-1))]
Y_cv    <- cv[c(n)]

X_test  <- test[c(1:(n-1))]
Y_test <- test[c(n)]

}

My code runs fine, but no data frames are created. Is there a way to do this? Thanks

Ben asked Jan 25 '17 15:01



2 Answers

This stores the nine data.frames in a list and returns that list:

split_data <- function(dataset, train_perc = 0.6, cv_perc = 0.2, test_perc = 0.2) {

  m <- nrow(dataset)
  n <- ncol(dataset)

  #Sort the data randomly
  data_perm <- dataset[sample(m),]

  # list to store all data.frames
  out <- list()

  #Split data into training, CV, and test sets
  out$train <- data_perm[1:round(train_perc*m),]
  out$cv <- data_perm[(round(train_perc*m)+1):round((train_perc+cv_perc)*m),]
  out$test <- data_perm[(round((train_perc+cv_perc)*m)+1):round((train_perc+cv_perc+test_perc)*m),]

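  #Split sets into X and Y
  #(train, cv and test exist only as list elements, so refer to them via out$)
  out$X_train <- out$train[c(1:(n-1))]
  out$Y_train <- out$train[c(n)]

  out$X_cv <- out$cv[c(1:(n-1))]
  out$Y_cv <- out$cv[c(n)]

  out$X_test <- out$test[c(1:(n-1))]
  out$Y_test <- out$test[c(n)]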

  return(out)

}
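
A short sketch of how the returned list might be used. To keep it self-contained it defines a compact variant of split_data() in which the test set simply takes the remaining rows (an assumption, not part of the answer above). The built-in iris data set is only a stand-in, and the name parts is hypothetical.

```r
# Compact variant of split_data(): every piece is returned in one named list.
split_data <- function(dataset, train_perc = 0.6, cv_perc = 0.2) {
  m <- nrow(dataset)
  n <- ncol(dataset)
  data_perm <- dataset[sample(m), ]            # shuffle the rows

  train_end <- round(train_perc * m)
  cv_end    <- round((train_perc + cv_perc) * m)

  train <- data_perm[1:train_end, ]
  cv    <- data_perm[(train_end + 1):cv_end, ]
  test  <- data_perm[(cv_end + 1):m, ]         # assumption: test = remaining rows

  list(train = train, cv = cv, test = test,
       X_train = train[1:(n - 1)], Y_train = train[n],
       X_cv    = cv[1:(n - 1)],    Y_cv    = cv[n],
       X_test  = test[1:(n - 1)],  Y_test  = test[n])
}

set.seed(1)                  # only to make the shuffle reproducible
parts <- split_data(iris)    # iris has 150 rows and 5 columns

names(parts)                 # the nine element names
dim(parts$X_train)           # 90 rows, 4 columns
head(parts[["Y_test"]])      # [[ ]] works as well as $
```

Each element is an ordinary data frame, so it can be passed on directly, for example parts$X_train, or copied into its own variable with X_train <- parts$X_train.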
manotheshark answered Sep 25 '22 12:09


If you want the data frames to be created in the workspace when the function finishes, do the following:

1) Optionally, create empty variables (e.g. Y_test = NULL) in your R console. This is not strictly required: if no variable of that name exists, "<<-" creates it in the global environment.
2) Inside your function, assign to those same variables with the "<<-" operator instead of "<-", i.e.

X_train <<- train[c(1:(n-1))]
Y_train <<- train[c(n)]

X_cv    <<- cv[c(1:(n-1))]
Y_cv    <<- cv[c(n)]

X_test  <<- test[c(1:(n-1))]
Y_test <<- test[c(n)]

The newly created data frames are then accessible from your workspace.
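
A minimal sketch of the "<<-" approach in isolation. The helper make_xy and the variable names X_demo, Y_demo, A_demo and B_demo are hypothetical, and iris is only a stand-in data set. The last lines show base R's list2env() as one possible alternative for unpacking a named list into the workspace.

```r
# Hypothetical helper: writes two data frames into the global workspace.
make_xy <- function(dataset) {
  n <- ncol(dataset)
  X_demo <<- dataset[1:(n - 1)]   # all columns except the last
  Y_demo <<- dataset[n]           # the last column only
  invisible(NULL)                 # nothing useful is returned
}

make_xy(iris)
exists("X_demo")    # TRUE: the variable now lives outside the function
dim(Y_demo)         # 150 rows, 1 column

# Alternative: return a named list and unpack it explicitly.
pieces <- list(A_demo = head(iris), B_demo = tail(iris))
invisible(list2env(pieces, envir = globalenv()))
exists("A_demo")    # TRUE
```

Writing to the workspace from inside a function can silently overwrite existing variables of the same name, so returning a list and unpacking it where needed is often easier to reason about.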

Abdul Basit Khan answered Sep 21 '22 12:09