I want to build a data frame from a formula that specifies which columns to select and which interaction terms (products of columns) to include.
I know about the model.frame function, but it does not give me the interaction terms.
For example:
df <- data.frame(x = c(1,2,3,4), y = c(2,3,4,7), z = c(5,6, 9, 1))
f <- formula('z~x*y')
model.frame(f, df)
output:
> df
x y z
1 1 2 5
2 2 3 6
3 3 4 9
4 4 7 1
> f <- formula('z~x*y')
> model.frame(f, df)
z x y
1 5 1 2
2 6 2 3
3 9 3 4
4 1 4 7
I hope to get:
z x y x*y
1 5 1 2 2
2 6 2 3 6
3 9 3 4 12
4 1 4 7 28
Is there a package that can do this? Ideally the result would be a sparse matrix, because the crossed (interaction) columns will be highly sparse.
You can use model.matrix:
> model.matrix(f, df)
(Intercept) x y x:y
1 1 1 2 2
2 1 2 3 6
3 1 3 4 12
4 1 4 7 28
attr(,"assign")
[1] 0 1 2 3
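The matrix above contains an intercept column and omits the response z, so it is not quite the layout asked for in the question. A minimal sketch of one way to get there, assuming the predictors are numeric as in the example (the names mf, mm and result are only illustrative):

```r
df <- data.frame(x = c(1, 2, 3, 4), y = c(2, 3, 4, 7), z = c(5, 6, 9, 1))
f <- z ~ x * y

# Evaluate the formula once, then expand it into the design matrix.
mf <- model.frame(f, df)
mm <- model.matrix(f, mf)

# Drop the intercept column; drop = FALSE keeps the result a matrix.
mm <- mm[, colnames(mm) != "(Intercept)", drop = FALSE]

# Put the response in front of the expanded predictor columns.
result <- cbind(z = model.response(mf), as.data.frame(mm))
result
```

The interaction column keeps the name x:y rather than x*y. With factor predictors, removing the intercept afterwards is not the same as writing z ~ x * y - 1 in the formula, because the latter changes how the factor levels are coded.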
If you want to store the result as a sparse matrix, you can use the Matrix package:
> mat <- model.matrix(f, df)
> library(Matrix)
> Matrix(mat, sparse = TRUE)
4 x 4 sparse Matrix of class "dgCMatrix"
(Intercept) x y x:y
1 1 1 2 2
2 1 2 3 6
3 1 3 4 12
4 1 4 7 28
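Converting with Matrix() still builds the dense matrix first. The Matrix package also provides sparse.model.matrix, which constructs the sparse design matrix from the formula directly. A short sketch, assuming the same df and f as in the question (smat is just an illustrative name):

```r
library(Matrix)

df <- data.frame(x = c(1, 2, 3, 4), y = c(2, 3, 4, 7), z = c(5, 6, 9, 1))
f <- z ~ x * y

# Build the design matrix in sparse form without a dense intermediate.
smat <- sparse.model.matrix(f, df)
smat
```

This probably matters most when the formula involves factors with many levels, where the dense matrix can become very large.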