Say I have a data frame df with dozens of identifying variables (in columns) and only a few measured variables (also in columns).
To avoid repeatedly typing all the variables for each argument, I assign the names of the identifying and measured columns of df to df_id and df_measured, respectively. It's easy enough to pass these vectors to melt to shorten its arguments:
df.m <- melt(df, id.vars = df_id, measure.vars = df_measured)
... but I'm at a loss for how to enter the formula = argument in dcast using the same method to specify my id variables, since it requires that the input point to numeric positions of the columns.
Do I have to make a vector of numeric positions similar to df_id and risk breaking my program if the order of my input columns changes, or can I refer to them by name and somehow still get that to work in the formula = argument? Thanks.
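You can use as.formula to construct a formula from a string, so the id columns can be referred to by name: paste the names in df_id together with " + " and convert the result.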
Here's an example:
library(reshape2)
## Example from `melt.data.frame`
names(airquality) <- tolower(names(airquality))
df_id <- c("month", "day")
aq <- melt(airquality, id.vars = df_id)
## Constructing the formula
f <- as.formula(paste(paste(df_id, collapse = " + "), "~ variable"))
## Applying it....
dcast(aq, f, value.var = "value", fun.aggregate = mean)
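If you build casting formulas like this in several places, the paste/as.formula step can be wrapped in a small function. The following is only a sketch: make_cast_formula and its col_var argument are hypothetical names introduced here, not part of reshape2, and it assumes the column names are syntactically valid R names (no spaces or special characters).

```r
# Hypothetical helper (not part of reshape2): build a casting formula
# from a character vector of id column names.
# Assumes the names are syntactically valid R names.
make_cast_formula <- function(id_vars, col_var = "variable") {
  # Join the id columns for the left-hand side: "month + day"
  lhs <- paste(id_vars, collapse = " + ")
  # Append the right-hand side and convert the string to a formula
  as.formula(paste(lhs, "~", col_var))
}

df_id <- c("month", "day")
f <- make_cast_formula(df_id)

# Variables referenced by the formula, in order of appearance
all.vars(f)
```

The resulting f can be passed to dcast exactly as in the example above, e.g. dcast(aq, f, value.var = "value", fun.aggregate = mean). Because the formula is built from names, reordering the columns of the input data frame does not affect it.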