Aggregate multiple rows of the same data.frame in R based on common values in given columns

Tags: dataframe, r, aggregate

I have a data.frame that looks like this:

# set example data
df <- read.table(textConnection("item\tsize\tweight\tvalue
A\t2\t3\t4
A\t2\t3\t6
B\t1\t2\t3
C\t3\t2\t1
B\t1\t2\t4
B\t1\t2\t2"), header = TRUE)

# print example data
df

  item size weight value
1    A    2      3     4
2    A    2      3     6
3    B    1      2     3
4    C    3      2     1
5    B    1      2     4
6    B    1      2     2

As you can see, the size and weight columns do not add any complexity, since they are the same for every row of a given item. However, there can be multiple values for the same item.

I want to collapse the data.frame to have one row per item using the mean value:

  item size weight value
1    A    2      3     5
3    B    1      2     3
4    C    3      2     1

I guess I have to use the aggregate function, but I could not figure out exactly how to get the above result.


asked Aug 14 '13 09:08

mschilli

4 Answers

Nowadays, this is what I would do:

library(dplyr)

df %>%
  group_by(item, size, weight) %>%
  summarize(value = mean(value)) %>%
  ungroup()

This yields the following result:

# A tibble: 3 x 4
   item  size weight value
  <chr> <int>  <int> <dbl>
1     A     2      3     5
2     B     1      2     3
3     C     3      2     1

I will leave the accepted answer as it is, since I specifically asked for aggregate, but I find the dplyr solution the most readable.
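Grouping by item, size and weight only yields one row per item if size and weight really are constant within each item, as the question states. A minimal base R sketch of how that assumption could be checked first is shown below; the names `keys` and `one_row_per_item` are made up for illustration, and the example data is rebuilt with `data.frame()` so the block runs on its own.

```r
# Example data from the question, rebuilt with data.frame()
df <- data.frame(
  item   = c("A", "A", "B", "C", "B", "B"),
  size   = c(2, 2, 1, 3, 1, 1),
  weight = c(3, 3, 2, 2, 2, 2),
  value  = c(4, 6, 3, 1, 4, 2)
)

# Distinct (item, size, weight) combinations present in the data
keys <- unique(df[c("item", "size", "weight")])

# TRUE when no item appears with more than one size/weight combination
# (anyDuplicated() returns 0 when there are no duplicates)
one_row_per_item <- !anyDuplicated(keys$item)
one_row_per_item
```

If the check comes back FALSE, grouping by all three columns would return several rows for the affected items, so the data would need a closer look before collapsing.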

answered Oct 14 '22 14:10

mschilli

Here is a solution using ddply from the plyr package:

library(plyr)
ddply(df, .(item), colwise(mean))

  item size weight value
1    A    2      3     5
2    B    1      2     3
3    C    3      2     1
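Note that colwise(mean) averages every non-grouping column, so size and weight are averaged as well; the result matches here only because they are constant within each item. A rough base R counterpart, assuming the same example data, could use the dot formula of aggregate, which likewise averages all remaining columns per item (the name `res` is just for illustration):

```r
# Example data from the question, rebuilt with data.frame()
df <- data.frame(
  item   = c("A", "A", "B", "C", "B", "B"),
  size   = c(2, 2, 1, 3, 1, 1),
  weight = c(3, 3, 2, 2, 2, 2),
  value  = c(4, 6, 3, 1, 4, 2)
)

# "." stands for all columns other than item; each is averaged per item
res <- aggregate(. ~ item, data = df, FUN = mean)
res
```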

answered Oct 18 '22 21:10

Metrics

aggregate(value ~ item + size + weight, FUN = mean, data=df)

  item size weight value
1    B    1      2     3
2    C    3      2     1
3    A    2      3     5
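The rows above come back as B, C, A because aggregate sorts by the grouping columns with the last one (weight) varying slowest. If the order of the desired output matters, one possible follow-up, sketched here with the example data rebuilt so the block is self-contained, is to reorder by item afterwards:

```r
# Example data from the question, rebuilt with data.frame()
df <- data.frame(
  item   = c("A", "A", "B", "C", "B", "B"),
  size   = c(2, 2, 1, 3, 1, 1),
  weight = c(3, 3, 2, 2, 2, 2),
  value  = c(4, 6, 3, 1, 4, 2)
)

# Mean value per (item, size, weight) combination
res <- aggregate(value ~ item + size + weight, data = df, FUN = mean)

# Sort by item and reset the row names
res <- res[order(res$item), ]
rownames(res) <- NULL
res
```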

answered Oct 18 '22 22:10

Mark Miller

df$value <- ave(df$value,df$item,FUN=mean)
df[!duplicated(df$item),]

  item size weight value
1    A    2      3     5
3    B    1      2     3
4    C    3      2     1
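One thing to be aware of: the first line above overwrites df$value with the group means, so the raw values are gone afterwards. A small variation that works on a copy instead, under the same example data (the name `out` is only illustrative), might look like this:

```r
# Example data from the question, rebuilt with data.frame()
df <- data.frame(
  item   = c("A", "A", "B", "C", "B", "B"),
  size   = c(2, 2, 1, 3, 1, 1),
  weight = c(3, 3, 2, 2, 2, 2),
  value  = c(4, 6, 3, 1, 4, 2)
)

# Work on a copy so df keeps its original values
out <- df

# Replace each value by the mean of its item group
out$value <- ave(out$value, out$item, FUN = mean)

# Keep only the first row of each item
out <- out[!duplicated(out$item), ]
out
```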

answered Oct 18 '22 20:10

Thomas
