Get the right hand side variables of an R formula

Tags: class, r, formula

I'm writing my first S3 class and its associated methods, and I would like to know how to subset my input data set so that it keeps only the variables specified in the formula.

data(iris)
f <- Species~Petal.Length + Petal.Width

With model.frame(f, iris) I get a subset with all the variables in the formula. How can I automatically keep only the right-hand side variables (in the example, Petal.Length and Petal.Width)?

WAF asked Jan 24 '14 10:01


4 Answers

You want labels and terms; see ?labels, ?terms, and ?terms.object.

labels(terms(f))
# [1] "Petal.Length" "Petal.Width" 

In particular, labels.terms returns the "term.labels" attribute of a terms object, which excludes the LHS variable.
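
Note that term labels are not always plain variable names. As a small sketch (the formula f3 below is made up for illustration and is not from the question), a transformed term or an interaction shows up in the labels exactly as written, while all.vars gives the bare variable names. When the formula contains only plain variables, as in the question, the labels can be used to index the data frame directly:

```r
# f3 is a hypothetical formula with a transformed term and an interaction
f3 <- Species ~ log(Petal.Length) + Petal.Width + Sepal.Length:Sepal.Width

term_labels <- labels(terms(f3))   # labels as written in the formula
term_labels
# [1] "log(Petal.Length)"        "Petal.Width"              "Sepal.Length:Sepal.Width"

var_names <- all.vars(f3[-2])      # bare variable names, LHS dropped
var_names
# [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"

# With plain variables only, the labels are valid column names
f <- Species ~ Petal.Length + Petal.Width
rhs_data <- iris[labels(terms(f))]
names(rhs_data)
# [1] "Petal.Length" "Petal.Width"
```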

tonytonov answered Nov 16 '22 14:11


If you have a function in your formula, e.g., log, and want to subset the data frame based on the variables, you can use get_all_vars. This will ignore the function and extract the untransformed variables:

f2 <- Species ~ log(Petal.Length) + Petal.Width

get_all_vars(f2[-2], iris)

    Petal.Length Petal.Width
1            1.4         0.2
2            1.4         0.2
3            1.3         0.2
4            1.5         0.2
...

If you just want the variable names, all.vars is a very helpful function:

all.vars(f2[-2])

[1] "Petal.Length" "Petal.Width" 

The [-2] drops the second element of the formula, which is the left-hand side when the formula is two-sided.
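
To make the difference from model.frame concrete, here is a short comparison using the same f2. It only looks at the column names each function returns; model.frame evaluates the transformation and names the column after it, whereas get_all_vars returns the original columns:

```r
f2 <- Species ~ log(Petal.Length) + Petal.Width

# model.frame evaluates log() and names the column after the term
mf_names <- names(model.frame(f2[-2], iris))
mf_names
# [1] "log(Petal.Length)" "Petal.Width"

# get_all_vars returns the untransformed variables
gav_names <- names(get_all_vars(f2[-2], iris))
gav_names
# [1] "Petal.Length" "Petal.Width"
```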

Sven Hohenstein answered Nov 16 '22 12:11


One way is to use subsetting to remove the LHS from the formula. Then you can use model.frame on this:

f[-2]
~Petal.Length + Petal.Width

model.frame(f[-2],iris)
    Petal.Length Petal.Width
1            1.4         0.2
2            1.4         0.2
3            1.3         0.2
4            1.5         0.2
5            1.4         0.2
6            1.7         0.4
...

James answered Nov 16 '22 14:11


The package formula.tools has a number of functions that make working with formulas easier. In your case:

> formula.tools::rhs.vars(f)
[1] "Petal.Length" "Petal.Width"

Relying on base R indexing such as f[-2] can be dangerous because the left-hand side can be missing; in a one-sided formula, element 2 is the right-hand side, so removing it discards the very variables you want.
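
If adding a dependency is not an option, one possible base R workaround is to go through terms and delete.response, which drops the response only when there is one. This is just a sketch: rhs_vars is a hypothetical helper name, and it assumes the formula does not contain `.` (terms would then need a data argument).

```r
# rhs_vars() is a hypothetical helper, not part of base R or formula.tools
rhs_vars <- function(f) {
  # delete.response() removes the response if present and leaves the
  # terms of a one-sided formula unchanged
  all.vars(delete.response(terms(f)))
}

two_sided <- rhs_vars(Species ~ log(Petal.Length) + Petal.Width)
two_sided
# [1] "Petal.Length" "Petal.Width"

one_sided <- rhs_vars(~ Petal.Length + Petal.Width)
one_sided
# [1] "Petal.Length" "Petal.Width"
```

Both calls give the same names, so the helper does not need to know in advance whether a left-hand side was supplied.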

CoderGuy123 answered Nov 16 '22 13:11