R colSums By Group

Tags:

r

plyr

In the following matrix dataset:

       1  2   3   4   5  
1950   7 20  21  15  61  
1951   2 10   6  26  57  
1952  12 27  43  37  34  
1953  14 16  40  47  94  
1954   2 17  62 113 101  
1955   3  4  43  99 148  
1956   2 47  31  85  79  
1957  17  5  38 216 228  
1958  11 20  15  76  68  
1959  16 20  43  30 226  
1960   9 28  28  70 201  
1961   1 31 124  74 137  
1962  12 25  37  41 200

I have been trying to calculate colSums by decade, i.e., find the sum of each column for 1950-1959, then for 1960-1969, and so on.

I tried tapply, ddply, etc., but couldn't find anything that actually works.

asked Jan 31 '12 17:01

jitendra

1 Answer

First we set up the matrix used as input.

Lines <- "1  2   3   4   5  
1950   7 20  21  15  61  
1951   2 10   6  26  57  
1952  12 27  43  37  34  
1953  14 16  40  47  94  
1954   2 17  62 113 101  
1955   3  4  43  99 148  
1956   2 47  31  85  79  
1957  17  5  38 216 228  
1958  11 20  15  76  68  
1959  16 20  43  30 226  
1960   9 28  28  70 201  
1961   1 31 124  74 137  
1962  12 25  37  41 200  "
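# Read the text into a data frame; the years end up as row names
DF <- read.table(text = Lines, check.names = FALSE)
# Convert to a matrix, keeping the row and column names
m <- as.matrix(DF)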
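
The setup above relies on how read.table infers the header: when the first line has one fewer field than the remaining lines, it is treated as a header and the first column is used as row names. A minimal sketch on a three-row excerpt of the data (first two columns only; the excerpt is chosen here for illustration):

```r
# Excerpt of the question's data: 2 column labels, 3 fields per data row
Lines <- "1 2
1950 7 20
1951 2 10
1960 9 28"

# check.names = FALSE keeps the labels as "1" and "2" rather than "X1" and "X2"
DF <- read.table(text = Lines, check.names = FALSE)
m <- as.matrix(DF)

rownames(m)  # "1950" "1951" "1960"
colnames(m)  # "1" "2"
dim(m)       # 3 2
```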

Below are three alternative solutions. (1) is the most flexible, since sum can easily be replaced with another function to get a different summary, but (2) is the shortest for this particular problem. The results also differ slightly: (1) produces a data.frame, while (2) and (3) produce a matrix.

1) aggregate

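# Decade of each row: integer-divide the year by 10, then scale back (1957 -> 1950)
decade <- 10 * (as.numeric(rownames(m)) %/% 10)
# Sum every column within each decade
m.ag <- aggregate(m, data.frame(decade), sum)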

which gives this data.frame:

> m.ag
  decade  1   2   3   4    5
1   1950 86 186 342 744 1096
2   1960 22  84 189 185  538
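
As an illustration of the flexibility of the aggregate approach, the sketch below swaps sum for mean and max. It uses a small four-row excerpt of the data (two years per decade, first two columns), and the names m.mean and m.max are chosen here for illustration only.

```r
# Four-row excerpt of the question's data
m <- matrix(c(7, 20,
              2, 10,
              9, 28,
              1, 31),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("1950", "1951", "1960", "1961"), c("1", "2")))

decade <- 10 * (as.numeric(rownames(m)) %/% 10)

# Same call as with sum, only the summary function changes
m.mean <- aggregate(m, data.frame(decade), mean)
m.max  <- aggregate(m, data.frame(decade), max)

m.mean  # decade 1950: 4.5, 15.0; decade 1960: 5.0, 29.5
m.max   # decade 1950: 7, 20;     decade 1960: 9, 31
```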

2) rowsum. This one is shorter and produces a matrix.

rowsum(m, decade)
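
For reference, here is a self-contained run of the rowsum approach on the full data from the question. The variable name res is introduced here only for illustration.

```r
Lines <- "1  2   3   4   5
1950   7 20  21  15  61
1951   2 10   6  26  57
1952  12 27  43  37  34
1953  14 16  40  47  94
1954   2 17  62 113 101
1955   3  4  43  99 148
1956   2 47  31  85  79
1957  17  5  38 216 228
1958  11 20  15  76  68
1959  16 20  43  30 226
1960   9 28  28  70 201
1961   1 31 124  74 137
1962  12 25  37  41 200"
m <- as.matrix(read.table(text = Lines, check.names = FALSE))

decade <- 10 * (as.numeric(rownames(m)) %/% 10)

# One row per decade, one column per original column
res <- rowsum(m, decade)
res
#       1   2   3   4    5
# 1950 86 186 342 744 1096
# 1960 22  84 189 185  538

is.matrix(res)  # TRUE
```

The decades appear as row names here, rather than as a separate decade column as in the aggregate result.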

3) split/sapply. This one produces a matrix as well. If we start from DF rather than m, as.data.frame(m) can be replaced with DF, shortening it slightly.

t(sapply(split(as.data.frame(m), decade), colSums))
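
A quick way to confirm that the three approaches agree is to compare their numeric values. The sketch below does this on a four-row excerpt of the data. The names by.aggregate, by.rowsum and by.split are chosen here for illustration, and attributes such as row names are deliberately ignored in the comparison because the approaches label their results differently.

```r
# Four-row excerpt of the question's data
m <- matrix(c(7, 20,
              2, 10,
              9, 28,
              1, 31),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("1950", "1951", "1960", "1961"), c("1", "2")))

decade <- 10 * (as.numeric(rownames(m)) %/% 10)

by.aggregate <- aggregate(m, data.frame(decade), sum)               # data.frame
by.rowsum    <- rowsum(m, decade)                                   # matrix
by.split     <- t(sapply(split(as.data.frame(m), decade), colSums)) # matrix

# Drop the decade column from the data.frame before comparing values
all.equal(as.matrix(by.aggregate[-1]), by.rowsum, check.attributes = FALSE)  # TRUE
all.equal(by.rowsum, by.split, check.attributes = FALSE)                     # TRUE
```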

EDIT: Added solutions (2) and (3) and some clarifications.

answered Oct 20 '22 19:10

G. Grothendieck
