Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R dplyr choose value from column with column name to choose in a separate column

Tags:

r

dplyr

Hi I have a data frame with different data columns (say x, y, z ) and another column which specifies which one to choose. I want to use dplyr::mutate (or similar) to make a new column that has the value corresponding to value in the column specified by "choose". But I want to keep all the columns in the first place. In my real data I also have a few other columns with metadata.

Example data:

library(dplyr)
testdf <- data.frame(x = 1:5, y = 11:15, z = 101:105, choose = c("z","y","x","y","z"))

I can make this work in my example using case_when but in my actual script the column names and choose column are generated and may have different values, so I don't want to hardcode what names there could be.

Desired output/test

mutate(testdf, selectedValue = case_when(choose == "x" ~x,
                                     choose == "y"~ y,
                                     choose == "z"~ z, T~NA_integer_))

#>   x  y   z choose selectedValue
#> 1 1 11 101      z           101
#> 2 2 12 102      y            12
#> 3 3 13 103      x             3
#> 4 4 14 104      y            14
#> 5 5 15 105      z           105

Created on 2019-09-18 by the reprex package (v0.3.0)

like image 263
Sarah Avatar asked Sep 18 '19 01:09

Sarah


People also ask

How do I select a column with specific names in R?

To pick out single or multiple columns use the select() function. The select() function expects a dataframe as it's first input ('argument', in R language), followed by the names of the columns you want to extract with a comma between each name.

How do you subset a Dataframe in R based on column names?

3.1 Subset by Column Name Let's use the same df[] notation and subset() function to subset the data frame by column name in R. To subset columns use select argument with values as column names to subset() .

How do I extract part of a column name in R?

To extract columns with a particular string in column name of an R data frame, we can use grepl function for column names and then subset the data frame with single square brackets.

How do I select a column based on a condition in R?

select_if(): Select columns based on a particular condition. One can use this function to, for example, select columns if they are numeric. Helper functions - starts_with(), ends_with(), contains(), matches(), one_of(): Select columns/variables based on their names.


1 Answers

Here is a data.table solution. This is a case where I don't think the dplyr solutions are more human readable than other solutions (base r and data.table specifically).

library(data.table)
testdt <- data.table(x = 1:5, y = 11:15, z = 101:105, choose = c("z","y","x","y","z"))
testdt[,selectedValue := get(choose), by = choose]
testdt
#>    x  y   z choose selectedValue
#> 1: 1 11 101      z           101
#> 2: 2 12 102      y            12
#> 3: 3 13 103      x             3
#> 4: 4 14 104      y            14
#> 5: 5 15 105      z           105

Created on 2019-09-17 by the reprex package (v0.3.0)

like image 90
ClancyStats Avatar answered Sep 21 '22 18:09

ClancyStats