
Extract a data frame using model.frame and formula

Tags:

r

I want to build a data frame from a formula that specifies which columns to select and which interaction (crossed) terms between columns to include.

I know about the model.frame function, but it does not return the interaction columns:

For example:

df <- data.frame(x = c(1,2,3,4), y = c(2,3,4,7), z = c(5,6, 9, 1))
f <- formula('z~x*y')
model.frame(f, df)

output:

> df
  x y z
1 1 2 5
2 2 3 6
3 3 4 9
4 4 7 1
> f <- formula('z~x*y')
> model.frame(f, df)
  z x y
1 5 1 2
2 6 2 3
3 9 3 4
4 1 4 7

I hope to get:

  z x y x*y
1 5 1 2 2
2 6 2 3 6
3 9 3 4 12
4 1 4 7 28

Is there a package that provides this? Ideally the result would be a sparse matrix, because the crossed columns will be highly sparse.

Yin Zhu asked Oct 06 '14 17:10



1 Answer

You can use model.matrix, which expands the formula into a design matrix that includes the x:y interaction column:

> model.matrix(f, df)
  (Intercept) x y x:y
1           1 1 2   2
2           1 2 3   6
3           1 3 4  12
4           1 4 7  28
attr(,"assign")
[1] 0 1 2 3
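
Note that model.matrix drops the response z and adds an (Intercept) column, so the result is not quite the layout asked for in the question. A minimal sketch of one way to get the z, x, y, x:y layout, assuming all predictors are numeric as in the example (with factors, removing the intercept changes the dummy coding); the names mf, mm and out are illustrative only:

```r
df <- data.frame(x = c(1, 2, 3, 4), y = c(2, 3, 4, 7), z = c(5, 6, 9, 1))
f <- z ~ x * y

# model.frame keeps the response, so use it to pull out z
mf <- model.frame(f, df)

# Remove the intercept from the formula; for numeric predictors this
# leaves only the columns x, y and x:y in the design matrix
mm <- model.matrix(update(f, . ~ . - 1), df)

# Prepend the response; check.names = FALSE keeps the column name "x:y"
out <- data.frame(z = model.response(mf), mm, check.names = FALSE)
out
```

Because "x:y" is not a syntactic R name, the column has to be accessed as out[["x:y"]] or with backticks.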

If you want to save the result as a sparse matrix, you can use the Matrix package:

> mat <- model.matrix(f, df)
> library(Matrix)
> Matrix(mat, sparse = TRUE)
4 x 4 sparse Matrix of class "dgCMatrix"
  (Intercept) x y x:y
1           1 1 2   2
2           1 2 3   6
3           1 3 4  12
4           1 4 7  28
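
The conversion above first builds the dense matrix and only then makes it sparse. If the dense intermediate is too large, the Matrix package also provides sparse.model.matrix, which builds the sparse design matrix from the formula directly. A small sketch on the same example data (the name smat is illustrative only):

```r
library(Matrix)

df <- data.frame(x = c(1, 2, 3, 4), y = c(2, 3, 4, 7), z = c(5, 6, 9, 1))
f <- z ~ x * y

# Build the design matrix in sparse form without a dense intermediate
smat <- sparse.model.matrix(f, df)
smat
```

For this tiny, fully populated example there is nothing to gain; the benefit should only show up when the crossed columns are mostly zero, as described in the question.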
Sven Hohenstein answered Oct 27 '22 18:10
