Why can't one have multiple variables passed to value.var in dcast? From ?dcast:
value.var name of column which stores values, see guess_value for default strategies to figure this out.
The documentation doesn't explicitly say that only a single variable can be passed as value.var. If I try to pass more than one, however, I get an error:
> library("reshape2")
> library("MASS")
>
> dcast(Cars93, AirBags ~ DriveTrain, mean, value.var=c("Price", "Weight"))
Error in .subset2(x, i, exact = exact) : subscript out of bounds
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
the condition has length > 1 and only the first element will be used
So is there a good reason for imposing this limitation? And is it possible to work around this (perhaps using reshape, etc.)?
This question is very much related to your other question from earlier today.
@beginneR wrote in the comments that "As long as the existing data is already in long-format, I don't see any general need to melt it before casting." In my answer posted at your other question, I gave an example of when melt would be required, or rather, how to decide whether your data are long enough.
This question here is another example of when further melting would be required since point 3 in my answer is not satisfied.
To get the behavior you want, try the following:
C93L <- melt(Cars93, measure.vars = c("Price", "Weight"))
dcast(C93L, AirBags ~ DriveTrain + variable, mean, value.var = "value")
#              AirBags 4WD_Price 4WD_Weight Front_Price Front_Weight
# 1 Driver & Passenger       NaN        NaN    26.17273     3393.636
# 2        Driver only     21.38       3623    18.69286     2996.250
# 3               None     13.88       2987    12.98571     2703.036
#   Rear_Price Rear_Weight
# 1      33.20      3515.0
# 2      28.23      3463.5
# 3      14.90      3610.0
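If this melt-then-cast pattern comes up often, it can be wrapped in a small function. The following is only a sketch: dcast_multi and its argument names are hypothetical and not part of reshape2, and it assumes a single row variable and a single column variable.

```r
library(reshape2)
library(MASS)  # provides the Cars93 data set

# Hypothetical helper (not part of reshape2): melt the requested measure
# columns into long form, then cast with one output column for each
# combination of column-variable level and measure.
dcast_multi <- function(data, row_var, col_var, measures, fun = mean) {
  long <- melt(data, id.vars = c(row_var, col_var), measure.vars = measures)
  f <- as.formula(paste(row_var, "~", col_var, "+ variable"))
  dcast(long, f, fun.aggregate = fun, value.var = "value")
}

res <- dcast_multi(Cars93, "AirBags", "DriveTrain", c("Price", "Weight"))
res
```

Passing id.vars explicitly means that only the columns needed for the cast are carried into the long data.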
An alternative is to use aggregate to calculate the means, and then use reshape or dcast to go from "long" to "wide". Both steps are required since reshape does not perform any aggregation:
temp <- aggregate(cbind(Price, Weight) ~ AirBags + DriveTrain,
Cars93, mean)
#              AirBags DriveTrain    Price   Weight
# 1        Driver only        4WD 21.38000 3623.000
# 2               None        4WD 13.88000 2987.000
# 3 Driver & Passenger      Front 26.17273 3393.636
# 4        Driver only      Front 18.69286 2996.250
# 5               None      Front 12.98571 2703.036
# 6 Driver & Passenger       Rear 33.20000 3515.000
# 7        Driver only       Rear 28.23000 3463.500
# 8               None       Rear 14.90000 3610.000
reshape(temp, direction = "wide",
idvar = "AirBags", timevar = "DriveTrain")
#              AirBags Price.4WD Weight.4WD Price.Front Weight.Front
# 1        Driver only     21.38       3623    18.69286     2996.250
# 2               None     13.88       2987    12.98571     2703.036
# 3 Driver & Passenger        NA         NA    26.17273     3393.636
#   Price.Rear Weight.Rear
# 1      28.23      3463.5
# 2      14.90      3610.0
# 3      33.20      3515.0
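For completeness, here is a sketch of the same two-step idea with dcast in place of reshape for the second step. The object names tempL and wide are just illustrative. Since the aggregated data still has two value columns, it is melted first; no aggregation function is needed in the cast because each cell then holds at most one value.

```r
library(reshape2)
library(MASS)  # provides the Cars93 data set

# Step 1: aggregate, so each AirBags/DriveTrain pair appears at most once
temp <- aggregate(cbind(Price, Weight) ~ AirBags + DriveTrain, Cars93, mean)

# Step 2: melt the two mean columns into a single "value" column
tempL <- melt(temp, id.vars = c("AirBags", "DriveTrain"))

# Step 3: cast to wide; no fun.aggregate is required here, and
# combinations that do not occur in the data are filled with NA
wide <- dcast(tempL, AirBags ~ DriveTrain + variable, value.var = "value")
wide
```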
I had the same issue and found this answer, "Error using dcast with multiple value.var", which suggests explicitly calling the data.table version of dcast, as follows:
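# multiple value.var; the input is converted first because data.table's dcast method expects a data.table
data.table::dcast(data.table::as.data.table(Cars93), AirBags ~ DriveTrain, mean, value.var=c("Price", "Weight"))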
I was able to cast multiple variables without error.