rowMeans function in dplyr

Tags:

r

dplyr

I have been trying to calculate rowMeans within dplyr's mutate function, but keep getting errors. Below is an example DATA set and the desired RESULT.

DATA = data.frame(SITE = c("A","A","A","A","B","B","B","C","C"), 
                  DATE = c("1","1","2","2","3","3","3","4","4"), 
                  STUFF = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
                  STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000))

RESULT = data.frame(SITE = c("A","A","A","A","B","B","B","C","C"), 
                    DATE = c("1","1","2","2","3","3","3","4","4"), 
                    STUFF = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
                    STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000),
                    NAYSA = c(1.5, 3, 45, 60, 150, 300, 450, 7500, 9000))

The code I have written begins by randomly sampling STUFF and STUFF2 within each SITE and DATE group. Then I would like to calculate the row means of STUFF and STUFF2 and store the result in a new column. I could accomplish this task using tidyr, but would then have to redo a large number of variables. I could also use base R, but I would prefer a solution using the mutate function in dplyr. Thanks in advance.

library(dplyr)

RESULT = group_by(DATA, SITE, DATE) %>%
  mutate(STUFF = sample(STUFF, replace = TRUE), STUFF2 = sample(STUFF2, replace = TRUE)) %>%
  # Each of these approaches, tried one at a time as the last step, returns an error
  mutate(NAYSA = rowMeans(DATA[,-1:-2]))
  # mutate(NAYSA = rowMeans(.[,-1:-2]))
  # mutate(NAYSA = rowMeans(.))

asked Mar 16 '15 17:03

Vesuccio

2 Answers

You can use the rowwise function in dplyr to do that. Because sample is random, your results will differ from the output below, but you will see that it works:

library(dplyr)

group_by(DATA, SITE, DATE) %>%
  mutate(STUFF = sample(STUFF, replace = TRUE),
         STUFF2 = sample(STUFF2, replace = TRUE)) %>%
  rowwise() %>%
  mutate(NAYSA = mean(c(STUFF, STUFF2)))

Output:

Source: local data frame [9 x 5]
Groups: <by row>

  SITE DATE STUFF STUFF2  NAYSA
1    A    1     1      2    1.5
2    A    1     2      2    2.0
3    A    2    30     80   55.0
4    A    2    30     60   45.0
5    B    3   200    600  400.0
6    B    3   300    200  250.0
7    B    3   100    600  350.0
8    C    4  5000  12000 8500.0
9    C    4  6000  10000 8000.0

As you can see, it calculates the mean of STUFF and STUFF2 for each row.
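To check the rowwise approach without the randomness from sample, the same call can be run on the original DATA. This is only a minimal sketch, assuming dplyr is installed; the name CHECK is made up for illustration. The resulting NAYSA values should match the NAYSA column of the desired RESULT in the question.

```r
library(dplyr)

DATA <- data.frame(SITE = c("A","A","A","A","B","B","B","C","C"),
                   DATE = c("1","1","2","2","3","3","3","4","4"),
                   STUFF = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
                   STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000))

# No sampling here, so the output is deterministic.
# CHECK is a hypothetical name used only for this sketch.
CHECK <- DATA %>%
  rowwise() %>%                               # treat every row as its own group
  mutate(NAYSA = mean(c(STUFF, STUFF2))) %>%  # mean of the two values in that row
  ungroup()                                   # drop the rowwise grouping again

CHECK$NAYSA
```

The final ungroup() is optional; it just returns an ordinary data frame so that later steps are not evaluated row by row.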

answered Oct 12 '22 21:10

LyzandeR

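@GregF Yep, ungroup() was the key. Thanks. Without it, mutate runs once per group, so the nine values returned by rowMeans do not fit the size of each group.

Working code: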

RESULT = group_by(DATA, SITE, DATE) %>% 
  mutate(STUFF = sample(STUFF,replace= TRUE), 
         STUFF2 = sample(STUFF2,replace= TRUE)) %>% 
  ungroup() %>% 
  mutate(NAYSA = rowMeans(.[,-1:-2]))
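
The negative indices in .[,-1:-2] depend on SITE and DATE being the first two columns. As a possible variation (a sketch, not the only way to do it), the columns to average can be named with select instead. The set.seed call and its value are assumptions added only to make the sampled values repeatable.

```r
library(dplyr)

DATA <- data.frame(SITE = c("A","A","A","A","B","B","B","C","C"),
                   DATE = c("1","1","2","2","3","3","3","4","4"),
                   STUFF = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
                   STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000))

set.seed(1)  # arbitrary seed, not part of the original code

RESULT <- DATA %>%
  group_by(SITE, DATE) %>%
  mutate(STUFF = sample(STUFF, replace = TRUE),
         STUFF2 = sample(STUFF2, replace = TRUE)) %>%
  ungroup() %>%
  # select() picks the columns by name, so column order no longer matters
  mutate(NAYSA = rowMeans(select(., STUFF, STUFF2)))

RESULT
```

With more variables, further column names can be added to the select call without changing the rest of the pipeline.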

answered Oct 12 '22 20:10

Vesuccio
