I have a data frame in R with the following structure.
> testData
date exch.code comm.code oi
1 1997-12-30 CBT 1 468710
2 1997-12-23 CBT 1 457165
3 1997-12-19 CBT 1 461520
4 1997-12-16 CBT 1 444190
5 1997-12-09 CBT 1 446190
6 1997-12-02 CBT 1 443085
....
77827 2004-10-26 NYME 967 10038
77828 2004-10-19 NYME 967 9910
77829 2004-10-12 NYME 967 10195
77830 2004-09-28 NYME 967 9970
77831 2004-08-31 NYME 967 9155
77832 2004-08-24 NYME 967 8655
What I want to do is produce a table that shows, for a given date and commodity, the total oi across every exchange code. So, the rows would be made up of
unique(testData$date)
and the columns would be
unique(testData$comm.code)
and each cell would be the total oi over all exch.codes on a given day.
Thanks,
The process involves two stages. First, collate the individual cases of raw data into groups according to a grouping variable. Second, perform whichever calculation you want on each group of cases.
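The two stages can be sketched in base R with split() followed by sapply(). This is only an illustration: the data frame `toy` and its values are made up, and only its column names are taken from the question.

```r
# Hypothetical toy data; column names follow the question, values are invented.
toy <- data.frame(
  date      = as.Date(c("1997-12-30", "1997-12-30", "1997-12-23", "1997-12-30")),
  exch.code = c("CBT", "NYME", "CBT", "NYME"),
  comm.code = c(1, 1, 1, 967),
  oi        = c(100, 50, 70, 10)
)

# Stage 1: collate the oi values into groups, one per (date, comm.code) pair.
# drop = TRUE discards combinations that never occur in the data.
groups <- split(toy$oi, list(toy$date, toy$comm.code), drop = TRUE)

# Stage 2: apply the calculation (here, sum) to each group.
totals <- sapply(groups, sum)
print(totals)
```

The result is a named vector with one total per group; the names combine the date and the commodity code.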
The aggregate() function computes summary statistics of the data by group. Typical statistics include the mean, minimum, and sum.
To use aggregate() to compute a mean in R, pass the numerical variable as the first argument, the categorical variable (as a list) as the second, and the function to apply (in this case mean) as the third. An alternative is to specify a formula of the form: numerical ~ categorical.
aggregate() splits the data into subsets, computes a summary statistic for each subset, and returns the result with one row per group. It is similar to GROUP BY in SQL, and it is useful for aggregate operations such as sum, count, mean, minimum, and maximum.
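Applied to the question, the function would be sum rather than mean, with date and comm.code as the grouping variables. A minimal sketch of both calling styles follows; `toy` is a made-up data frame that only borrows the question's column names.

```r
# Hypothetical toy data; column names follow the question, values are invented.
toy <- data.frame(
  date      = as.Date(c("1997-12-30", "1997-12-30", "1997-12-23", "1997-12-30")),
  exch.code = c("CBT", "NYME", "CBT", "NYME"),
  comm.code = c(1, 1, 1, 967),
  oi        = c(100, 50, 70, 10)
)

# List interface: numerical variable, grouping variables as a list, function.
# The summed column is named "x" in the result.
by_list <- aggregate(toy$oi,
                     by  = list(date = toy$date, comm.code = toy$comm.code),
                     FUN = sum)

# Formula interface: numerical ~ categorical (+ categorical ...).
# The summed column keeps the name "oi".
by_formula <- aggregate(oi ~ date + comm.code, data = toy, FUN = sum)

print(by_formula)
```

Both calls return a long table with one row per date and commodity, summed over all exchange codes.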
The plyr package is good at this, and you should be able to get this done with a single ddply()
call. Something like this (untested):
library(plyr)
ddply(testData, .(date, comm.code), function(x) sum(x$oi))
should work.
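Note that the ddply() call produces a long table (one row per date and commodity), whereas the question asks for dates as rows and commodity codes as columns. One way to get that layout directly, swapping in base R's xtabs() instead of plyr, is sketched below. The data frame `toy` is made up for illustration; with the real data you would pass testData instead.

```r
# Hypothetical toy data; column names follow the question, values are invented.
toy <- data.frame(
  date      = as.Date(c("1997-12-30", "1997-12-30", "1997-12-23", "1997-12-30")),
  exch.code = c("CBT", "NYME", "CBT", "NYME"),
  comm.code = c(1, 1, 1, 967),
  oi        = c(100, 50, 70, 10)
)

# With a left-hand side, xtabs() sums that variable within each cell,
# so each cell is the total oi over all exch.codes for that date and commodity.
wide <- xtabs(oi ~ date + comm.code, data = toy)
print(wide)
```

Cells for date and commodity combinations that never occur are filled with 0. If you would rather see NA there, tapply(toy$oi, list(toy$date, toy$comm.code), sum) gives the same layout with NA for the empty cells.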