I'll apologise in advance - I know this has likely been answered elsewhere, but I don't seem to be able to find the answer I need, and can't manage to adapt other code I have found to my needs.
I have a data frame:
FILE | TECHNIQUE | COUNT
------------------------
A | ONE | 10
A | TWO | 25
B | ONE | 5
B | TWO | 30
C | ONE | 30
C | TWO | 50
I would like to produce a data frame of the difference of the COUNT values between ONE and TWO, with a row for each FILE, i.e.
FILE | DIFFERENCE
-----------------
A | 15
B | 25
C | 20
I'm convinced I should be able to do this fairly easily with base R or Plyr, but am a bit stuck. Could anyone suggest a good way to do this, and perhaps good tutorials on Plyr that might help me with similar problems in the future?
Thanks
Using aggregate
in base:
> aggregate(.~FILE, data= DF[, -2], FUN=diff)
FILE COUNT
1 A 15
2 B 25
3 C 20
Using ddply
in plyr
> ddply(DF[,-2], .(FILE), summarize, DIFFERENCE=diff(COUNT))
FILE DIFFERENCE
1 A 15
2 B 25
3 C 20
with data.table
> # library(data.table)
> DT <- data.table(DF)
> DT[, diff(COUNT), by=FILE]
FILE V1
1: A 15
2: B 25
3: C 20
with by
> with(DF, by(COUNT, FILE, diff))
FILE: A
[1] 15
-----------------------------------------------------------------------------
FILE: B
[1] 25
-----------------------------------------------------------------------------
FILE: C
[1] 20
with tapply
> tapply(DF$COUNT, DF$FILE, diff)
A B C
15 25 20
with summaryBy
from doBy package
> # library(doBy)
> summaryBy(COUNT~FILE, FUN=diff, data=DF)
FILE COUNT.diff
1 A 15
2 B 25
3 C 20
Update As percentage:
> aggregate(.~FILE, data= DF[, -2], function(x) (x[1]/x[2])*100)
FILE COUNT
1 A 40.00000
2 B 16.66667
3 C 60.00000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With