Combine: rowwise(), mutate(), across(), for multiple functions

Tags: r, dplyr, across, rowwise

This is somewhat related to an earlier question. In principle, I am trying to understand how rowwise operations with mutate across multiple columns work when applying more than one function (mean(), sum(), min(), etc.).

I have learned that across does this job and not c_across. I have also learned that mean() differs from min() in that mean() doesn't work on data frames, so the row has to be converted to a vector first, which can be done with unlist or as.matrix. I learned this from Ronak Shah here: Understanding rowwise() and c_across()
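To make that difference concrete, here is a minimal sketch using a one-row data frame (the name `row` is made up for illustration). min() accepts a data frame of numeric columns directly, while mean() only warns and returns NA until the row is flattened to a vector:

```r
# One row of the numeric columns, as a data frame (hypothetical example data)
row <- data.frame(a = 1L, b = 6L, c = 11L, e = 1L)

# min() (like max() and sum()) works on an all-numeric data frame
row_min <- min(row)

# mean() on a data frame warns and returns NA
row_mean_df <- suppressWarnings(mean(row))

# Flattening the row to a plain vector makes mean() work
row_mean <- mean(unlist(row))

print(c(min = row_min, mean_df = row_mean_df, mean = row_mean))
```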

Now to my actual case: I was able to do this task, but I lose one column, d. How can I avoid losing column d in this setting?

My df:

df <- structure(list(a = 1:5, b = 6:10, c = 11:15, d = c("a", "b", 
"c", "d", "e"), e = 1:5), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

This does not work:

df %>% 
  rowwise() %>% 
  mutate(across(a:e), 
         avg = mean(unlist(cur_data()), na.rm = TRUE),
         min = min(unlist(cur_data()), na.rm = TRUE), 
         max = max(unlist(cur_data()), na.rm = TRUE)
  )

# Output:
      a     b     c d         e   avg min   max  
  <int> <int> <int> <chr> <int> <dbl> <chr> <chr>
1     1     6    11 a         1    NA 1     a    
2     2     7    12 b         2    NA 12    b    
3     3     8    13 c         3    NA 13    c    
4     4     9    14 d         4    NA 14    d    
5     5    10    15 e         5    NA 10    e
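The NA and character columns in that output come from type coercion: unlist() on a row that mixes integers with a character value turns everything into character. A small sketch, assuming a list `row` that mimics the first row of df:

```r
# Hypothetical stand-in for the first row of df, including the character column d
row <- list(a = 1L, b = 6L, c = 11L, d = "a", e = 1L)

# unlist() coerces the mixed values to a single type: character
v <- unlist(row)
print(class(v))

# mean() of a character vector warns and returns NA
avg <- suppressWarnings(mean(v))
print(avg)

# min() and max() still "work", but they compare strings, not numbers
print(min(v))
print(max(v))
```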

This works, but I lose column d:

df %>% 
  select(-d) %>% 
  rowwise() %>% 
  mutate(across(a:e), 
         avg = mean(unlist(cur_data()), na.rm = TRUE),
         min = min(unlist(cur_data()), na.rm = TRUE), 
         max = max(unlist(cur_data()), na.rm = TRUE)
  )

      a     b     c     e   avg   min   max
  <int> <int> <int> <int> <dbl> <dbl> <dbl>
1     1     6    11     1  4.75     1    11
2     2     7    12     2  5.75     2    12
3     3     8    13     3  6.75     3    13
4     4     9    14     4  7.75     4    14
5     5    10    15     5  8.75     5    15

asked May 01 '21 14:05

TarJae

3 Answers

Using pmap() from purrr may be preferable, since you select the data only once and can use the tidyselect helpers:

library(dplyr)
library(purrr)

# Compute the row-wise statistics over the numeric columns only
df %>% 
 mutate(pmap_dfr(across(where(is.numeric)),
                 ~ data.frame(max = max(c(...)),
                              min = min(c(...)),
                              avg = mean(c(...)))))

      a     b     c d         e   max   min   avg
  <int> <int> <int> <chr> <int> <int> <int> <dbl>
1     1     6    11 a         1    11     1  4.75
2     2     7    12 b         2    12     2  5.75
3     3     8    13 c         3    13     3  6.75
4     4     9    14 d         4    14     4  7.75
5     5    10    15 e         5    15     5  8.75

Or with the addition of tidyr:

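library(tidyr)  # for unnest_wider()

# Build a list column of statistics, then spread it into columns
df %>% 
 mutate(res = pmap(across(where(is.numeric)),
                   ~ list(max = max(c(...)),
                          min = min(c(...)),
                          avg = mean(c(...))))) %>%
 unnest_wider(res)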
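For readers unfamiliar with pmap(): it walks over the supplied columns in parallel and calls the function once per row, passing that row's values as `...`. A minimal sketch with a made-up list `nums` (not part of the original data):

```r
library(purrr)

# Hypothetical input: three "columns" with two rows each
nums <- list(a = 1:2, b = 6:7, c = 11:12)

# Each call receives one value per column; c(...) collects them into a vector
row_avg <- pmap_dbl(nums, function(...) mean(c(...)))
row_max <- pmap_int(nums, function(...) max(c(...)))

print(row_avg)  # one mean per row
print(row_max)  # one maximum per row
```

The typed variants (pmap_dbl(), pmap_int()) are used here only so the result is a plain vector; the answer above returns a data frame or a list per row instead.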

answered Oct 18 '22 04:10

tmfmnk

Edit: the best approach here is to restrict c_across() to the numeric columns:

df %>%
  rowwise() %>% 
  mutate(min = min(c_across(a:e & where(is.numeric)), na.rm = TRUE),
         max = max(c_across(a:e & where(is.numeric)), na.rm = TRUE), 
         avg = mean(c_across(a:e & where(is.numeric)), na.rm = TRUE)
  )

# A tibble: 5 x 8
# Rowwise: 
      a     b     c d         e   min   max   avg
  <int> <int> <int> <chr> <int> <int> <int> <dbl>
1     1     6    11 a         1     1    11  4.75
2     2     7    12 b         2     2    12  5.75
3     3     8    13 c         3     3    13  6.75
4     4     9    14 d         4     4    14  7.75
5     5    10    15 e         5     5    15  8.75

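Earlier answer: your working version won't even work properly if you change the order of the new columns, because cur_data() also picks up the columns created earlier in the same mutate() call. Here avg is computed over a, b, c, e and the new min and max: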

df %>% 
  select(-d) %>% 
  rowwise() %>% 
  mutate(across(a:e), 
         min = min(unlist(cur_data()), na.rm = TRUE),
         max = max(unlist(cur_data()), na.rm = TRUE), 
         avg = mean(unlist(cur_data()), na.rm = TRUE)
  )

# A tibble: 5 x 7
# Rowwise: 
      a     b     c     e   min   max   avg
  <int> <int> <int> <int> <int> <int> <dbl>
1     1     6    11     1     1    11  5.17
2     2     7    12     2     2    12  6.17
3     3     8    13     3     3    13  7.17
4     4     9    14     4     4    14  8.17
5     5    10    15     5     5    15  9.17
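The 5.17 in the first row can be reproduced by hand: by the time avg is computed, the row already contains the freshly created min and max values. A quick check with the first row's numbers (the variable names are illustrative only):

```r
# Values of a, b, c, e in the first row
vals <- c(a = 1, b = 6, c = 11, e = 1)

# What the row looks like once min and max have been added
with_extras <- c(vals, min = min(vals), max = max(vals))

wrong_avg <- mean(with_extras)  # 31 / 6, about 5.17
right_avg <- mean(vals)         # 19 / 4 = 4.75

print(round(c(wrong = wrong_avg, right = right_avg), 2))
```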

Therefore, it is advisable to do it like this:

df %>% 
  select(-d) %>% 
  rowwise() %>% 
  mutate(min = min(c_across(a:e), na.rm = TRUE),
         max = max(c_across(a:e), na.rm = TRUE), 
         avg = mean(c_across(a:e), na.rm = TRUE)
  )

# A tibble: 5 x 7
# Rowwise: 
      a     b     c     e   min   max   avg
  <int> <int> <int> <int> <int> <int> <dbl>
1     1     6    11     1     1    11  4.75
2     2     7    12     2     2    12  5.75
3     3     8    13     3     3    13  6.75
4     4     9    14     4     4    14  7.75
5     5    10    15     5     5    15  8.75

One more alternative is

cols <- c('a', 'b', 'c', 'e')
df %>%
  rowwise() %>% 
  mutate(min = min(c_across(all_of(cols)), na.rm = TRUE),
         max = max(c_across(all_of(cols)), na.rm = TRUE),
         avg = mean(c_across(all_of(cols)), na.rm = TRUE)
  )

# A tibble: 5 x 8
# Rowwise: 
      a     b     c d         e   min   max   avg
  <int> <int> <int> <chr> <int> <int> <int> <dbl>
1     1     6    11 a         1     1    11  4.75
2     2     7    12 b         2     2    12  5.75
3     3     8    13 c         3     3    13  6.75
4     4     9    14 d         4     4    14  7.75
5     5    10    15 e         5     5    15  8.75

Even the group_by approach suggested by @Sinh won't work properly in these cases.

answered Oct 18 '22 03:10

AnilGoyal

Here is one method that keeps column d: temporarily move it into the row names (column_to_rownames), so that the data frame passed to mutate contains only the numeric columns, and then turn the row names back into a column after the transformation (rownames_to_column):

library(dplyr)
library(tibble)
library(purrr)
df %>% 
   column_to_rownames('d') %>%
   mutate(max = reduce(., pmax), min = reduce(., pmin), 
         avg = rowMeans(.)) %>% 
   rownames_to_column('d')
#  d a  b  c e max min  avg
#1 a 1  6 11 1  11   1 4.75
#2 b 2  7 12 2  12   2 5.75
#3 c 3  8 13 3  13   3 6.75
#4 d 4  9 14 4  14   4 7.75
#5 e 5 10 15 5  15   5 8.75
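The same vectorised idea can also be sketched in base R without touching the row names, by selecting the numeric columns into a helper object first. This is only an illustrative variant, not the answer's method; `num` is a made-up name and the data frame below is a plain data.frame copy of the question's df:

```r
# Plain data.frame version of the example data from the question
df <- data.frame(a = 1:5, b = 6:10, c = 11:15,
                 d = c("a", "b", "c", "d", "e"), e = 1:5)

# Keep only the numeric columns for the calculations; d stays in df
num <- df[sapply(df, is.numeric)]

# pmax()/pmin() take the columns as separate arguments, hence do.call()
df$max <- do.call(pmax, num)
df$min <- do.call(pmin, num)
df$avg <- unname(rowMeans(num))

print(df)
```

Because pmax(), pmin() and rowMeans() are vectorised over rows, no rowwise() step is needed here.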

answered Oct 18 '22 02:10

akrun
