I'm used to doing melt and cast all the time, and this time I'm looking for a neat one-liner.
require(reshape)
# first I melt some data:
m <- melt(mtcars, id.vars = c("cyl", "am"), measure.vars = "hp")
# then cast it:
cast(m, cyl + am ~ ., each(min, mean, sd, max))
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335
Is this possible with ddply or something similar? I'm desperate for a one-liner. I tried:
ddply(mtcars, cyl + am ~ hp, each(min, max))
   cyl am  hp min   max
1    4  0  62   0 146.7
2    4  0  95   0 140.8
3    4  0  97   0 120.1
4    4  1  52   1  75.7
5    4  1  65   1  71.1
6    4  1  66   1  79.0
7    4  1  91   0 120.3
8    4  1  93   1 108.0
9    4  1 109   1 121.0
10   4  1 113   1 113.0
11   6  0 105   0 225.0
12   6  0 110   0 258.0
13   6  0 123   0 167.6
14   6  1 110   0 160.0
15   6  1 175   0 175.0
16   8  0 150   0 318.0
17   8  0 175   0 400.0
18   8  0 180   0 275.8
19   8  0 205   0 472.0
20   8  0 215   0 460.0
21   8  0 230   0 440.0
22   8  0 245   0 360.0
23   8  1 264   0 351.0
24   8  1 335   0 335.0
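Of course, this runs, but it doesn't summarise hp by cyl and am: hp ends up as a third grouping variable, and min and max are taken over each whole piece of the data frame. It's been a while since I used plyr and reshape, so I've kind of lost my muscle memory... so... excuse me for the trivial question... =/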
The melt() function is not part of base R; it is provided by the reshape package (and its successor, reshape2). It reshapes a wide data frame into a long one, stacking the measurement columns you choose.
melt() converts a data frame with several measurement columns into this canonical long format, which has one row for every observed (measured) value.
Reshaping data in R means rearranging rows and columns to suit the analysis at hand. The data is usually held in a data frame; functions such as rbind() and cbind() append rows or columns, while melt() and cast() reorganise them.
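To make the "one row per measured value" idea concrete, here is a small sketch using the same mtcars data as the question. The choice of hp and wt as measured columns and the name long are just for illustration, and it assumes the reshape package is installed.

```r
library(reshape)  # provides melt() and cast(); not part of base R

# Keep cyl and am as identifiers and stack two measurement columns.
long <- melt(mtcars, id.vars = c("cyl", "am"), measure.vars = c("hp", "wt"))

# One row per observed value: 32 cars x 2 measured variables.
dim(long)
head(long, 3)
```

The result has the two id columns plus a variable column naming the measurement and a value column holding it, which is the shape cast() expects.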
summarize may be your friend:
ddply(m, c("cyl", "am"), summarize
, min = min(value)
, mean = mean(value)
, sd = sd(value)
, max = max(value)
)
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335
Using plyr:
> require(plyr)
> ddply(mtcars,c("cyl","am"),summarise, min=min(hp), mean=mean(hp), sd=sd(hp), max=max(hp))
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335
> ddply(mtcars, .(cyl, am), summarise,
min=min(hp), mean=mean(hp), sd=sd(hp), max=max(hp))
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335
I'm not sure how to avoid having to name each function twice, though...
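One possible way around the double naming, offered as a sketch rather than the definitive answer: each() from plyr returns a single function that yields a named vector, so it can be applied to hp inside an anonymous function. The names stats and res are made up for this example, and it assumes plyr is installed.

```r
library(plyr)

# each() bundles several functions into one that returns a named vector,
# so every statistic is named only once.
stats <- each(min, mean, sd, max)

# Split mtcars by cyl and am, then apply all four statistics to hp.
res <- ddply(mtcars, .(cyl, am), function(d) stats(d$hp))
res
```

This should give the same six-row summary as the summarise calls, with the column names taken from the function names. Inlined, it becomes the one-liner ddply(mtcars, .(cyl, am), function(d) each(min, mean, sd, max)(d$hp)).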