Mend reshape-based habits with plyr: melt/cast vs. ddply

I'm used to doing melt and cast all the time, and this time I'm looking for a neat one-liner.

require(reshape)
# first I melt some data:
m <- melt(mtcars, id.vars = c("cyl", "am"), measure.vars = "hp")
# then cast it:
cast(m, cyl + am ~ ., each(min, mean, sd, max))
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335

Is this possible with ddply or something similar? I'm desperate for a one-liner. I tried:

ddply(mtcars, cyl + am ~ hp, each(min, max))
   cyl am  hp min   max
1    4  0  62   0 146.7
2    4  0  95   0 140.8
3    4  0  97   0 120.1
4    4  1  52   1  75.7
5    4  1  65   1  71.1
6    4  1  66   1  79.0
7    4  1  91   0 120.3
8    4  1  93   1 108.0
9    4  1 109   1 121.0
10   4  1 113   1 113.0
11   6  0 105   0 225.0
12   6  0 110   0 258.0
13   6  0 123   0 167.6
14   6  1 110   0 160.0
15   6  1 175   0 175.0
16   8  0 150   0 318.0
17   8  0 175   0 400.0
18   8  0 180   0 275.8
19   8  0 205   0 472.0
20   8  0 215   0 460.0
21   8  0 230   0 440.0
22   8  0 245   0 360.0
23   8  1 264   0 351.0
24   8  1 335   0 335.0

Of course, this works, but it doesn't summarise hp by cyl and am. It's been a while since I used plyr and reshape, so I've kind of lost my muscle memory... so... excuse me for a trivial question... =/

aL3xa asked Oct 10 '11 19:10


3 Answers

summarize may be your friend:

ddply(m, c("cyl", "am"), summarize
      , min = min(value)
      , mean = mean(value)
      , sd = sd(value)
      , max = max(value)
)

  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335
Chase answered Nov 15 '22 07:11


Using plyr:

> require(plyr)
> ddply(mtcars,c("cyl","am"),summarise, min=min(hp), mean=mean(hp), sd=sd(hp), max=max(hp))
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335
ROLO answered Nov 15 '22 08:11


> ddply(mtcars, .(cyl, am), summarise, 
        min=min(hp), mean=mean(hp), sd=sd(hp), max=max(hp))
  cyl am min      mean       sd max
1   4  0  62  84.66667 19.65536  97
2   4  1  52  81.87500 22.65542 113
3   6  0 105 115.25000  9.17878 123
4   6  1 110 131.66667 37.52777 175
5   8  0 150 194.16667 33.35984 245
6   8  1 264 299.50000 50.20458 335

I'm not sure how to avoid having to name each function twice, though...
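
One possible way around the repetition, sketched here rather than taken from the answers: each() already returns a single function that produces a named vector, so it can be applied to hp inside an anonymous function. This assumes plyr is installed and that ddply turns each named vector into one row of the result; hp_stats and res are illustrative names, not part of the original code.

```r
library(plyr)

# each() bundles several functions into one that returns a named vector,
# so every summary function is written only once.
hp_stats <- each(min, mean, sd, max)

# Apply the bundled function to the hp column of every cyl/am group.
# (Assumption: ddply binds the named vectors into one row per group.)
res <- ddply(mtcars, .(cyl, am), function(d) hp_stats(d$hp))
print(res)
```

The column names come from the function names passed to each(), so the result should have the same layout as the summarise version above.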

Harlan answered Nov 15 '22 09:11