Mutate multiple columns in a dataframe

I have a data set that looks like this.

bankname    bankid  year    totass    cash    bond    loans
Bank A      1       1881    244789    7250    20218   29513
Bank B      2       1881    195755    10243   185151  2800
Bank C      3       1881    107736    13357   177612  NA
Bank D      4       1881    170600    35000   20000   5000
Bank E      5       1881    32000000  351266  314012  NA

I want to compute some ratios from the bank balance sheets, so that the data set looks like this:

bankname    bankid  year    totass    cash    bond    loans   CashtoAsset  BondtoAsset  LoanstoAsset
Bank A      1       1881    244789    7250    20218   29513   0.030        0.083        0.121
Bank B      2       1881    195755    10243   185151  2800    0.052        0.946        0.014
Bank C      3       1881    107736    13357   177612  NA      0.124        1.649        NA
Bank D      4       1881    170600    35000   20000   5000    0.205        0.117        0.029
Bank E      5       1881    32000000  351266  314012  NA      0.011        0.010        NA

Here is the code to replicate the data

bankname <- c("Bank A", "Bank B", "Bank C", "Bank D", "Bank E")
bankid   <- c(1, 2, 3, 4, 5)
year     <- c(1881, 1881, 1881, 1881, 1881)
totass   <- c(244789, 195755, 107736, 170600, 32000000)
cash     <- c(7250, 10243, 13357, 35000, 351266)
bond     <- c(20218, 185151, 177612, 20000, 314012)
loans    <- c(29513, 2800, NA, 5000, NA)
bankdata <- data.frame(bankname, bankid, year, totass, cash, bond, loans)

First, I got rid of NAs in balance sheets.

cols <- c("totass", "cash", "bond", "loans")
bankdata[cols][is.na(bankdata[cols])] <- 0

Then I compute ratios

library(dplyr)
bankdata <- mutate(bankdata, CashtoAsset = cash / totass)
bankdata <- mutate(bankdata, BondtoAsset = bond / totass)
bankdata <- mutate(bankdata, LoanstoAsset = loans / totass)

But instead of computing all these ratios line by line, I want to create a loop that does this all at once. In Stata, I would do

foreach x of varlist cash bond loans {
by bankid: gen `x'toAsset = `x'/ totass
}

How would I do this?

asked Oct 06 '14 15:10

H Park

2 Answers

Update (as of the 18th of March, 2019)

dplyr has changed again. We used to pass funs() to .funs, as in funs(name = f(.)), but funs() is deprecated from dplyr 0.8.0 onward. Instead of funs(), we now pass a list of formulas, as in list(name = ~f(.)). See the following new examples.

bankdata %>%
    mutate_at(.funs = list(toAsset = ~./totass), .vars = vars(cash:loans))

bankdata %>%
    mutate_at(.funs = list(toAsset = ~./totass), .vars = c("cash", "bond", "loans"))

bankdata %>%
    mutate_at(.funs = list(toAsset = ~./totass), .vars = 5:7)
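
As a sketch that goes beyond the dplyr versions discussed in this answer: from dplyr 1.0.0 onward, mutate_at() and the other scoped verbs are superseded by across(). Assuming dplyr 1.0.0 or later is installed, the same ratios can be written as below. The object name out_across is only illustrative, and the data frame is rebuilt from the question so the example runs on its own.

```r
library(dplyr)

# Same data as in the question
bankdata <- data.frame(
  bankname = c("Bank A", "Bank B", "Bank C", "Bank D", "Bank E"),
  bankid   = c(1, 2, 3, 4, 5),
  year     = c(1881, 1881, 1881, 1881, 1881),
  totass   = c(244789, 195755, 107736, 170600, 32000000),
  cash     = c(7250, 10243, 13357, 35000, 351266),
  bond     = c(20218, 185151, 177612, 20000, 314012),
  loans    = c(29513, 2800, NA, 5000, NA)
)

# Divide each of cash, bond and loans by totass. The .names argument builds
# the new column names directly, so the original columns are kept and no
# underscore has to be removed afterwards.
out_across <- bankdata %>%
  mutate(across(cash:loans, ~ .x / totass, .names = "{.col}toAsset"))

names(out_across)
```

The glue-style "{.col}" placeholder stands for the name of each input column, which is why the result columns come out as cashtoAsset, bondtoAsset and loanstoAsset.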

Update (as of the 2nd of December, 2017)

Since I answered this question, I have realized that some SO users have been checking this answer. The dplyr package has changed since then. Therefore, I leave the following update. I hope this will help some R users to learn how to use mutate_at().

mutate_each() is now deprecated; use mutate_at() instead. You specify the columns to transform in .vars and the function to apply in .funs. There are three ways to specify the columns: with vars(), with a character vector of column names, or with column positions (e.g., 5:7 in this case). Note that if you use a column in group_by(), you need to adjust the column positions. Have a look at this question.

bankdata %>%
    mutate_at(.funs = funs(toAsset = ./totass), .vars = vars(cash:loans))

bankdata %>%
    mutate_at(.funs = funs(toAsset = ./totass), .vars = c("cash", "bond", "loans"))

bankdata %>%
    mutate_at(.funs = funs(toAsset = ./totass), .vars = 5:7)

#  bankname bankid year   totass   cash   bond loans cash_toAsset bond_toAsset loans_toAsset
#1   Bank A      1 1881   244789   7250  20218 29513   0.02961734  0.082593581    0.12056506
#2   Bank B      2 1881   195755  10243 185151  2800   0.05232561  0.945830247    0.01430359
#3   Bank C      3 1881   107736  13357 177612    NA   0.12397899  1.648585431            NA
#4   Bank D      4 1881   170600  35000  20000  5000   0.20515826  0.117233294    0.02930832
#5   Bank E      5 1881 32000000 351266 314012    NA   0.01097706  0.009812875            NA

I purposely gave the name toAsset to the custom function in .funs, since it becomes the suffix of the new column names. Previously, I used rename(), but I think it is much easier to clean up column names with gsub() in the present approach. If the above result is saved as out, run the following code to remove the _ from the column names.

names(out) <- gsub(names(out), pattern = "_", replacement = "")
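
One caveat worth sketching: gsub() on "_" removes every underscore in every column name, not only the ones added by mutate_at(). The column names in this data set contain no other underscores, so the line above is safe here. The vector nms below is hypothetical and only illustrates what would happen with a name such as bank_name, and how anchoring the pattern avoids it.

```r
# Hypothetical column names, including one that legitimately contains "_"
nms <- c("bank_name", "totass", "cash", "cash_toAsset", "bond_toAsset")

# Removing every "_" also changes bank_name
stripped <- gsub("_", "", nms)
stripped

# Anchoring the pattern at the end of the name only touches the new columns
cleaned <- sub("_toAsset$", "toAsset", nms)
cleaned
```

With a real result this would be applied as names(out) <- sub("_toAsset$", "toAsset", names(out)).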

Original answer

I think you can save some typing in this way with dplyr. The downside is you overwrite cash, bond, and loans.

bankdata %>%
    group_by(bankname) %>%
    mutate_each(funs(whatever = ./totass), cash:loans)

#  bankname bankid year   totass       cash        bond      loans
#1   Bank A      1 1881   244789 0.02961734 0.082593581 0.12056506
#2   Bank B      2 1881   195755 0.05232561 0.945830247 0.01430359
#3   Bank C      3 1881   107736 0.12397899 1.648585431         NA
#4   Bank D      4 1881   170600 0.20515826 0.117233294 0.02930832
#5   Bank E      5 1881 32000000 0.01097706 0.009812875         NA

If you prefer your expected outcome, I think some typing is necessary. The renaming part seems to be something you gotta do.

ana <- bankdata %>%
    group_by(bankname) %>%
    summarise_each(funs(whatever = ./totass), cash:loans) %>%
    rename(cashtoAsset = cash, bondtoAsset = bond, loanstoAsset = loans)

merge(bankdata, ana, by = "bankname")

#  bankname bankid year   totass   cash   bond loans cashtoAsset bondtoAsset loanstoAsset
#1   Bank A      1 1881   244789   7250  20218 29513  0.02961734 0.082593581   0.12056506
#2   Bank B      2 1881   195755  10243 185151  2800  0.05232561 0.945830247   0.01430359
#3   Bank C      3 1881   107736  13357 177612    NA  0.12397899 1.648585431           NA
#4   Bank D      4 1881   170600  35000  20000  5000  0.20515826 0.117233294   0.02930832
#5   Bank E      5 1881 32000000 351266 314012    NA  0.01097706 0.009812875           NA

answered Oct 08 '22 12:10

jazzurro

Here is a data.table solution.

library(data.table)
setDT(bankdata)
bankdata[, paste0(names(bankdata)[5:7], "toAsset") := 
           lapply(.SD, function(x) x/totass), .SDcols=5:7]
bankdata
#    bankname bankid year   totass   cash   bond loans cashtoAsset bondtoAsset loanstoAsset
# 1:   Bank A      1 1881   244789   7250  20218 29513  0.02961734 0.082593581   0.12056506
# 2:   Bank B      2 1881   195755  10243 185151  2800  0.05232561 0.945830247   0.01430359
# 3:   Bank C      3 1881   107736  13357 177612     0  0.12397899 1.648585431   0.00000000
# 4:   Bank D      4 1881   170600  35000  20000  5000  0.20515826 0.117233294   0.02930832
# 5:   Bank E      5 1881 32000000 351266 314012     0  0.01097706 0.009812875   0.00000000
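
For comparison, and not as part of the data.table approach above, the Stata foreach loop from the question also has a fairly direct base R analogue. The sketch below assumes a plain data.frame. The name bank_df is illustrative, and only the columns needed for the ratios are rebuilt so that the example is self-contained.

```r
# Balance sheet columns from the question, as a plain data.frame
bank_df <- data.frame(
  totass = c(244789, 195755, 107736, 170600, 32000000),
  cash   = c(7250, 10243, 13357, 35000, 351266),
  bond   = c(20218, 185151, 177612, 20000, 314012),
  loans  = c(29513, 2800, NA, 5000, NA)
)

# Loop over the column names, as foreach does over the varlist in Stata,
# and add one <name>toAsset column per iteration
for (x in c("cash", "bond", "loans")) {
  bank_df[[paste0(x, "toAsset")]] <- bank_df[[x]] / bank_df$totass
}

names(bank_df)
```

Because the NA values are left in place here, the loan ratios for Bank C and Bank E stay NA rather than becoming 0.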

answered Oct 08 '22 11:10

KFB

Tags: r, dplyr, stata
