
Add a new column to dataframe using mutate_ where column name is specified by a variable

Tags: dataframe, r, dplyr

I have a data frame to which I want to add a column whose name is stored in a variable:

df <- diamonds
NewName <- "SomeName"
df <- df %>% mutate_(paste0(NewName," = \"\""))

This gives me the following error:

Error: attempt to use zero-length variable name

I've seen plenty of examples of mutate_ being used to change column names, but not to dynamically create columns. Any help?

16 revs, 12 users 31% asked Oct 18 '22 00:10


1 Answer

The issue comes down to when the statement is evaluated. As I understand it, the goal of mutate_ is not to rebuild the syntax of mutate from strings, for example by using paste to create mutate(SomeName = ""). Instead, it lets you build the expressions to evaluate programmatically and pass them in. I believe your approach fails because it ends up looking for a function named "".

Instead, pass in an expression that can be evaluated (here, paste('') is a placeholder that yields an empty string) and set the name of the new column from your variable with setNames. This should work:

library(dplyr)    # mutate_ and %>%
library(ggplot2)  # the diamonds dataset
df <- diamonds
NewName <- "SomeName"
df <- df %>% mutate_(.dots = setNames("paste('')", NewName))
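
To see what is actually handed to .dots, it can help to look at the value setNames returns on its own. This is a minimal base R sketch; the second pair of names (CutColor, DoublePrice) is made up purely for illustration and does not come from the question:

```r
# setNames() attaches names to a vector. Here a single expression string
# is given the desired column name as its name.
NewName <- "SomeName"
dots <- setNames("paste('')", NewName)

names(dots)   # the name that becomes the new column
unname(dots)  # the expression string, evaluated later inside the data frame

# The same construction extends to several columns at once
# (column names below are hypothetical, for illustration only).
more_dots <- setNames(c("paste(cut, color)", "price * 2"),
                      c("CutColor", "DoublePrice"))
names(more_dots)
```

The point is that the column name lives in the names of the vector, not inside the expression string, so nothing has to be pasted into a "name = value" form.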

This also gives you more control. For example, you could paste cut and color together:

df <- df %>% mutate_(.dots = setNames("paste(cut, color)",NewName))

This gives:

   carat       cut color clarity depth table price     x     y     z    SomeName
   <dbl>     <ord> <ord>   <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>       <chr>
1   0.23     Ideal     E     SI2  61.5    55   326  3.95  3.98  2.43     Ideal E
2   0.21   Premium     E     SI1  59.8    61   326  3.89  3.84  2.31   Premium E
3   0.23      Good     E     VS1  56.9    65   327  4.05  4.07  2.31      Good E
4   0.29   Premium     I     VS2  62.4    58   334  4.20  4.23  2.63   Premium I
5   0.31      Good     J     SI2  63.3    58   335  4.34  4.35  2.75      Good J
6   0.24 Very Good     J    VVS2  62.8    57   336  3.94  3.96  2.48 Very Good J
7   0.24 Very Good     I    VVS1  62.3    57   336  3.95  3.98  2.47 Very Good I
8   0.26 Very Good     H     SI1  61.9    55   337  4.07  4.11  2.53 Very Good H
9   0.22      Fair     E     VS2  65.1    61   337  3.87  3.78  2.49      Fair E
10  0.23 Very Good     H     VS1  59.4    61   338  4.00  4.05  2.39 Very Good H

(Of note, the original syntax worked for me on the first run and then failed on subsequent runs, which is worth digging into.)
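
As a side note, mutate_ and the other underscore verbs have been deprecated since dplyr 0.7.0, so on a current installation it may be worth doing the same thing with tidy evaluation or plain base R. The sketch below assumes dplyr 0.7.0 or later and uses a tiny stand-in data frame (not the real diamonds data) so that it runs without ggplot2:

```r
library(dplyr)

# Small stand-in for diamonds; the values are illustrative only
df <- data.frame(cut = c("Ideal", "Premium"), color = c("E", "I"))
NewName <- "SomeName"

# Tidy evaluation: unquote the name on the left-hand side of :=
df1 <- df %>% mutate(!!NewName := paste(cut, color))

# Base R: assign a column by name with [[ ]]
df2 <- df
df2[[NewName]] <- paste(df2$cut, df2$color)

names(df1)
names(df2)
```

Both variants keep the column name in a variable and the expression as ordinary R code, which avoids building expressions out of strings altogether.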

Mark Peterson answered Oct 21 '22 01:10