tapply() function dependent on multiple columns in R

Tags: r, summarization

In R, I have a table with the columns Location, sample_year and count:

Location sample_year count  
A        1995        1
A        1995        1  
A        2000        3  
B        2000        1  
B        2000        1  
B        2000        5

I want a summary table that looks at both the 'Location' and 'sample_year' columns and sums 'count' for each unique combination of the two, rather than grouping by a single column. The end result should be:

Location sample_year sum_count
A        1995        2
A        2000        3
B        2000        7

I could paste the columns together into a new column to create a unique Location-sample_year key, but this is not a clean solution, especially if I need to scale it up to three columns at some point. There must be a better approach.

asked Mar 07 '11 05:03

DeLongTime

1 Answer

You can use aggregate with a formula.

First the data:

x <- read.table(textConnection("Location sample_year count  
A        1995        1
A        1995        1  
A        2000        3  
B        2000        1  
B        2000        1  
B        2000        5"), header = TRUE)

Aggregate using sum with a formula specifying the grouping:

aggregate(count ~ Location+sample_year, data = x, sum)
    Location sample_year count
1        A        1995     2
2        A        2000     3
3        B        2000     7
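
Since the question asks about tapply() specifically, here is a minimal sketch of the same summary done with tapply(), which also accepts several grouping columns when they are passed as a list (a data frame of the grouping columns works too). The object names sums and long are only illustrative, and the data frame is rebuilt inline so the snippet runs on its own:

x <- data.frame(
  Location    = c("A", "A", "A", "B", "B", "B"),
  sample_year = c(1995, 1995, 2000, 2000, 2000, 2000),
  count       = c(1, 1, 3, 1, 1, 5)
)

# Group by both columns at once; the result is a matrix with one
# dimension per grouping column, and NA where a combination has no rows.
sums <- tapply(x$count, x[c("Location", "sample_year")], sum)

# Reshape the matrix to long form and drop the empty combinations.
long <- as.data.frame(as.table(sums), responseName = "sum_count")
long <- long[!is.na(long$sum_count), ]
long

Note that tapply() returns a cross-tabulated matrix rather than a data frame, and that sample_year comes back as a factor after the reshaping step, so aggregate() is probably the more direct route to the table asked for. Either way, a third grouping column should only need one more term in the formula or one more column in the index list.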

177

answered Sep 23 '22 00:09

mdsumner
