Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R Programming - Sum Elements of Rows with Common Values

Hello and thank you in advance for your assistance,

(PLEASE Note Comments section for additional insight: i.e. the cost column in the example below was added to this question; Simon, provides a great answer, but the cost column itself is not represented in the data response from him, although the function he provides works with the cost column)

I have a data set, lets call it 'data' which looks like this

NAME     DATE     COLOR   PAID    COST
Jim      1/1/2013 GREEN   150     100
Jim      1/2/2013 GREEN   50      25
Joe      1/1/2013 GREEN   200     150
Joe      1/2/2013 GREEN   25      10

What I would like to do is sum the PAID (and COST) elements of the records with the same NAME value and reduce the number of rows (as in this example) to 2, such that my new data frame looks like this:

NAME     DATE     COLOR   PAID    COST
Jim      1/2/2013 GREEN   200     125
Joe      1/2/2013 GREEN   225     160

As far as the dates are concerned, I don't really care about which one survives the summation process.

I've gotten as far as rowSums(data), but I'm not exactly certain how to use it. Any help would be greatly appreciated.

like image 738
Jonathan Charlton Avatar asked Apr 10 '13 18:04

Jonathan Charlton


People also ask

How do I sum rows together in R?

Syntax: mutate(new-col-name = rowSums(.)) The rowSums() method is used to calculate the sum of each row and then append the value at the end of each row under the new column name specified. The argument . is used to apply the function over all the cells of the data frame. Syntax: rowSums(.)

How do I sum different groups in R?

How to do group by sum in R? By using aggregate() from R base or group_by() function along with the summarise() from the dplyr package you can do the group by on dataframe on a specific column and get the sum of a column for each group.

How do I find rows with certain values in R?

You can use the following basic syntax to find the rows of a data frame in R in which a certain value appears in any of the columns: library(dplyr) df %>% filter_all(any_vars(. %in% c('value1', 'value2', ...)))


1 Answers

aggregate is the function you are looking for:

aggregate( cbind( PAID , COST ) ~ NAME + COLOR , data = data , FUN = sum )
# NAME PAID
# 1  Jim  200
# 2  Joe  225
like image 199
Simon O'Hanlon Avatar answered Oct 19 '22 06:10

Simon O'Hanlon