How to aggregate this data in R

Tags: r, aggregate

I have a data frame in R with the following structure.

> testData
            date exch.code comm.code     oi
1     1997-12-30       CBT         1 468710
2     1997-12-23       CBT         1 457165
3     1997-12-19       CBT         1 461520
4     1997-12-16       CBT         1 444190
5     1997-12-09       CBT         1 446190
6     1997-12-02       CBT         1 443085
....
77827 2004-10-26      NYME       967  10038
77828 2004-10-19      NYME       967   9910
77829 2004-10-12      NYME       967  10195
77830 2004-09-28      NYME       967   9970
77831 2004-08-31      NYME       967   9155
77832 2004-08-24      NYME       967   8655

What I want to do is produce a table that shows, for a given date and commodity, the total oi across every exchange code. So the rows would be made up of

unique(testData$date)

and the columns would be

unique(testData$comm.code)

and each cell would be the total oi over all exch.codes on a given day.

Thanks,

asked May 24 '10 at 20:05 by stevejb


People also ask

How do you aggregate a dataset in R?

The process involves two stages. First, collect the individual cases of raw data into groups defined by a grouping variable. Second, perform whichever calculation you want on each group of cases.
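
The two stages can be sketched in base R with split() and sapply(). The toy data frame below is made up for illustration and only borrows the column names from the question.

```r
# Toy data (hypothetical values): open interest for two commodities on two exchanges.
toy <- data.frame(
  comm.code = c(1, 1, 2, 2),
  exch.code = c("CBT", "NYME", "CBT", "NYME"),
  oi        = c(10, 20, 5, 7)
)

# Stage 1: collect the oi values into groups keyed by comm.code.
groups <- split(toy$oi, toy$comm.code)

# Stage 2: apply the calculation (here, sum) to each group.
totals <- sapply(groups, sum)
print(totals)
```

The result is a named vector with one total per commodity code.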

What does aggregate () in R do?

The aggregate() function computes summary statistics of the data by group. Typical statistics include the mean, minimum, and sum.

How do I get aggregate value in R?

To compute a group mean with aggregate() in R, pass the numerical variable as the first argument, the categorical variable (wrapped in a list) as the second, and the function to apply (in this case mean) as the third. An alternative is to specify a formula of the form numerical ~ categorical.
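
A minimal sketch of both calling styles, using a made-up data frame (the values are illustrative, not from the question's data):

```r
# Toy data (hypothetical values).
toy <- data.frame(
  comm.code = c(1, 1, 2, 2),
  oi        = c(10, 20, 5, 7)
)

# List interface: numeric variable first, grouping variable as a named list,
# function third. The result column is named "x".
by_list <- aggregate(toy$oi, by = list(comm.code = toy$comm.code), FUN = mean)

# Formula interface: numerical ~ categorical. The result column keeps the name "oi".
by_formula <- aggregate(oi ~ comm.code, data = toy, FUN = mean)

print(by_list)
print(by_formula)
```

Both calls return a data frame with one row per group; they differ mainly in how the result column is named.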

What does it mean to aggregate data in R?

The aggregate() function in R splits the data into subsets, computes summary statistics for each subset, and returns the result in grouped form. It is similar to GROUP BY in SQL, and is useful for aggregate operations such as sum, count, mean, minimum, and maximum.


1 Answer

The plyr package is good at this, and you should be able to get this done with a single ddply() call. Something like (untested)

library(plyr)
ddply(testData, .(date, comm.code), function(x) sum(x$oi))

should work.
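
Note that ddply() returns one row per (date, comm.code) pair rather than the date-by-commodity grid described in the question. As an alternative sketch (not part of the original answer), base R's xtabs() can sum oi over all exchange codes and lay the totals out with dates as rows and commodity codes as columns. The small testData below is invented for illustration and only mimics the question's column names.

```r
# Stand-in for the question's data frame (hypothetical values).
testData <- data.frame(
  date      = as.Date(c("1997-12-30", "1997-12-30", "1997-12-30", "1997-12-23")),
  exch.code = c("CBT", "NYME", "CBT", "CBT"),
  comm.code = c(1, 1, 967, 1),
  oi        = c(100, 50, 7, 80)
)

# Sum oi for every (date, comm.code) combination, across all exch.code values.
# Rows are dates, columns are commodity codes.
wide <- xtabs(oi ~ date + comm.code, data = testData)
print(wide)
```

Combinations with no observations show up as 0 in the xtabs() result, so a zero cell can mean either "no data" or a true total of zero.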

answered Oct 14 '22 at 09:10 by Dirk Eddelbuettel