
Pass rows of a data frame as parameters to a function while keeping other arguments constant

Tags: r

Following up on the question "Pass rows of a data frame as arguments to a function in R with column names specifying the arguments":

I want to train the following model with different combinations of parameters:

library(xgboost)
library(Matrix)

df <- data.frame(y = sample(0:1, 1000, replace = TRUE),
                 a = rnorm(1000),
                 b = rnorm(1000),
                 c = rnorm(1000),
                 d = rnorm(1000))

train <- sparse.model.matrix(object = y~.-1, data = df)

model <- xgboost(data = train,
                 label = df$y,
                 # parameters
                 nrounds = 10, 
                 subsample = 0.5,
                 colsample_bytree = 0.8)

I created a grid of parameters, and I want to pass each row of the grid to the xgboost function while keeping the data and label arguments constant.

param <- expand.grid(nrounds = c(10, 50, 100),
                     subsample = c(0.5, 0.8, 0.9),
                     colsample_bytree = c(0.8))

I would like to pass the arguments by column name (if matching by name is not an option, matching by column order would also work), since this would make the call reusable for different functions.

D Pinto asked Feb 08 '17 12:02



1 Answer

I had a similar problem and searched in vain until I found an approach in Hadley's Advanced R. It lets you pass parameters as they appear in a data frame, using the column names as argument names. Read here:

https://adv-r.hadley.nz/functionals.html#pmap

The solution is purrr::pmap, described in Hadley's Advanced R, 8.4.5. It calls a function once per row of a data frame, passing each column as a named argument.
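As a minimal sketch of that idea, the toy example below runs pmap over a small data frame. The function `describe` and the data frame `grid` are hypothetical names made up for illustration. The columns are deliberately in a different order from the function's arguments to show that matching happens by name.

```r
library(purrr)

# Hypothetical toy function; its argument names match the column names below.
describe <- function(n, scale, label) {
  paste0(label, ": ", n * scale)
}

# Columns are in a different order than the arguments of describe().
grid <- data.frame(
  label = c("a", "b", "c"),
  scale = c(10, 10, 100),
  n     = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# pmap() goes through the data frame row by row and calls
# describe(label = ..., scale = ..., n = ...) once per row.
out <- pmap(grid, describe)
```

Here `out` is a list with one element per row of `grid`.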

This is my own code which I recently used along with quanteda to mess around with the Kaggle SMS Spam dataset. These are the possibilities for my parameters:

library(tibble)  # tibble() replaces the deprecated data_frame()

tolower <- tibble(tolower = c(TRUE, FALSE))
stem <- tibble(stem = c(TRUE, FALSE))
remove_punct <- tibble(remove_punct = c(TRUE, FALSE))

This is a bonus and not necessary, but I found I needed all of the combinations of my parameters to run a Naive Bayes model. Thanks to Y J via this SO post:

# Cross join any number of data frames: merge(..., by = NULL) returns every row combination
expand.grid.df <- function(...) Reduce(function(...) merge(..., by = NULL), list(...))
parameters <- expand.grid.df(tolower, stem, remove_punct)

So, now my parameters look like this:

> parameters
  tolower  stem remove_punct
1    TRUE  TRUE         TRUE
2   FALSE  TRUE         TRUE
3    TRUE FALSE         TRUE
4   FALSE FALSE         TRUE
5    TRUE  TRUE        FALSE
6   FALSE  TRUE        FALSE
7    TRUE FALSE        FALSE
8   FALSE FALSE        FALSE
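When the candidate values are plain vectors rather than separate data frames, base R's expand.grid (which the question already uses) should produce an equivalent grid without a helper. A small sketch, reusing the column names from the output above:

```r
# Every combination of the three logical flags; the first column varies fastest.
parameters <- expand.grid(tolower = c(TRUE, FALSE),
                          stem = c(TRUE, FALSE),
                          remove_punct = c(TRUE, FALSE))
```

The merge-based helper remains useful when the pieces being combined are data frames with more than one column each.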

And now for the magic, passing the parameters on to my function of choice (dfm) via pmap:

library(purrr)  # provides pmap(); dfm() comes from quanteda
mymodels <- pmap(parameters, dfm, x = mycorpus)

(x = mycorpus is an extra argument that stays constant across calls; pmap passes it on to dfm unchanged.)

Here's what I got:

> length(mymodels)
[1] 8
> mymodels[[1]]
Document-feature matrix of: 5,572 documents, 7,714 features (99.8% sparse).
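The same pattern can be sketched for the grid in the question. To keep the example runnable without training anything, `fit_model` below is a hypothetical stand-in that takes the same argument names as the xgboost call in the question and only records what it received. The small `train` and `y` objects are placeholders as well.

```r
library(purrr)

# Grid from the question: one row per parameter combination.
param <- expand.grid(nrounds = c(10, 50, 100),
                     subsample = c(0.5, 0.8, 0.9),
                     colsample_bytree = c(0.8))

# Hypothetical stand-in for xgboost(): same argument names as in the question,
# but it only returns the values it was called with.
fit_model <- function(data, label, nrounds, subsample, colsample_bytree) {
  list(n_obs = length(label),
       nrounds = nrounds,
       subsample = subsample,
       colsample_bytree = colsample_bytree)
}

# Placeholder inputs standing in for `train` and `df$y` from the question.
train <- matrix(rnorm(40), nrow = 10)
y <- sample(0:1, 10, replace = TRUE)

# The columns of `param` change from call to call;
# `data` and `label` are passed unchanged to every call.
fits <- pmap(param, fit_model, data = train, label = y)
```

With the objects from the question, the analogous call would presumably be pmap(param, xgboost, data = train, label = df$y), assuming the installed xgboost version accepts the data and label arguments used there.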

Hope this helps you, or anyone else looking into this method!

Marian Minar answered Oct 03 '22 01:10