Mutating multiple columns in a data frame using dplyr

Tags:

r

dplyr

I have the following data frame df:

  v1 v2 v3 v4
1  1  5  7  4
2  2  6 10  3

I want to obtain the following data frame df2 by multiplying columns v1*v3 and v2*v4:

  v1 v2 v3 v4 v1v3 v2v4
1  1  5  7  4    7   20
2  2  6 10  3   20   18

How can I do that using dplyr? Using mutate_each?

I need a solution that can be generalized to a large number of variables and not only 4 (v1 to v4). This is the code to generate the example:

v1 <- c(1, 2)
v2 <- c(5, 6)
v3 <- c(7, 10)
v4 <- c(4, 3)
df <- data.frame(v1, v2, v3, v4)
v1v3 <- c(v1 * v3)
v2v4 <- c(v2 * v4)
df2 <- cbind(df, v1v3, v2v4)

asked Nov 09 '16 16:11

sbac

1 Answer

You are really close.

library(dplyr)

# add each product as a new column next to the originals
df2 <- 
    df %>% 
    mutate(v1v3 = v1 * v3,
           v2v4 = v2 * v4)

Such a beautifully simple language, right?

For more great tricks please see here.

EDIT: Thanks to @Facottons's pointer to this answer: https://stackoverflow.com/a/34377242/5088194, here is a tidy approach to this problem. It avoids hard-coding a line for each new product column. While it is a bit more verbose than the base R approach, the logic is more immediately readable. Each row gets an id so that its products can be joined back to the original frame, and the product column names are built from the paired column numbers.

library(dplyr)
library(tidyr)

# tag each row so its products can be joined back to the original frame
df <- 
    df %>%
    mutate(row_id = row_number())

# convert the data to tidy format and pair the columns to be multiplied together
# (2 is the pairing offset: v1 pairs with v3, v2 pairs with v4)
tidy_df <- 
    df %>%
    gather(column, value, -row_id) %>% 
    mutate(column = as.numeric(sub("v", "", column)),
           pair = if_else(column > 2, column - 2, column),
           prod_grp = paste0("v", pair, "v", pair + 2))

# compute the product for each row and each pair, one column per pair
prod_df <- 
    tidy_df %>% 
    group_by(row_id, prod_grp) %>% 
    summarize(val = prod(value)) %>% 
    ungroup() %>% 
    spread(prod_grp, val)

# put the original frame and the product frame together
final_df <- 
    df %>% 
    left_join(prod_df, by = "row_id") %>% 
    select(-row_id)
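
For a larger number of columns, the pairing can also be built directly from the column names. The following is only a minimal sketch, assuming the columns are ordered so that each column in the first half pairs with the column at the same position in the second half (as v1..v4 do here). The names half, first, second and products are illustrative and do not come from the question.

```r
library(dplyr)

# rebuild the example data from the question
df <- data.frame(v1 = c(1, 2), v2 = c(5, 6), v3 = c(7, 10), v4 = c(4, 3))

# assumption: column i pairs with column i + half
cols   <- names(df)
half   <- length(cols) / 2
first  <- cols[seq_len(half)]         # "v1" "v2"
second <- cols[half + seq_len(half)]  # "v3" "v4"

# multiply the two halves element-wise, then name each product after its pair
products <- df[first] * df[second]
names(products) <- paste0(first, second)

# append the product columns to the original frame
df2 <- bind_cols(df, products)
```

Because the pairs are derived from names(df), the same lines should work unchanged for v1 to v100, as long as the ordering assumption holds and the number of columns is even.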

answered Sep 29 '22 07:09

leerssej
