
How to use a character vector of column names in the formula argument of dcast (reshape2)

Tags: r, reshape2

Say I have a dataframe df with dozens of identifying variables (in columns) and only a few measured variables (also in columns).

To avoid repetitively typing all the variables for each argument, I assign the names of the identifying and measured df columns to df_id and df_measured, respectively. It's easy enough to input these vectors to shorten the argument inputs for melt...

df.m  <- melt(df, id.vars = df_id, measure.vars = df_measured)

... but I'm at a loss for how to enter the formula = argument in dcast using the same method to specify my id variables since it requires that the input point to numeric positions of the columns.

Do I have to make a vector of numeric positions similar to df_id and risk broken functionality of my program if my input columns change in order, or can I refer to them by name and somehow still get that to work in the formula = argument? Thanks.

asked Dec 20 '22 06:12 by mcjudd


1 Answer

You can use as.formula to construct a formula.

Here's an example:

library(reshape2)
## Example from `melt.data.frame`
names(airquality) <- tolower(names(airquality))
df_id <- c("month", "day")
aq <- melt(airquality, id.vars = df_id)

## Constructing the formula
f <- as.formula(paste(paste(df_id, collapse = " + "), "~ variable"))

## Applying it....
dcast(aq, f, value.var = "value", fun.aggregate = mean)
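If you build these formulas in several places, the paste/as.formula step above could be wrapped in a small helper. This is only a sketch: make_cast_formula is a hypothetical name, not a reshape2 function. It backtick-quotes each column name, on the assumption that some of your real column names may not be syntactic R names (a name like "air temp" would otherwise fail to parse in as.formula).

```r
# Hypothetical helper (not part of reshape2): build a casting formula
# from character vectors of column names.
make_cast_formula <- function(lhs, rhs = "variable") {
  # Backtick-quote every name so non-syntactic names still parse
  quote_names <- function(x) paste0("`", x, "`")
  as.formula(paste(
    paste(quote_names(lhs), collapse = " + "),
    "~",
    paste(quote_names(rhs), collapse = " + ")
  ))
}

df_id <- c("month", "day")
f <- make_cast_formula(df_id)   # same formula as in the answer: month + day ~ variable
```

The result can then be passed to dcast in the same way as f above, e.g. dcast(aq, f, value.var = "value"). Because the columns are referred to by name, reordering the input columns does not affect the formula.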
answered May 01 '23 02:05 by A5C1D2H2I1M1N2O1R2T1