Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String as formula

Tags:

r

dplyr

I've tried searching through the forums and unable to find help. I'm quite new to R, and am having limited success in loading in some strings to be used as a formula.

I have a csv with the following format:

            Sam, Frank, Dennis, Steven, Robert, Georgia
Region_1    218,    763,    811,    812,    797,    574
Region_2    474,    983,    343,    697,    310,    335
Region_3    700,    813,    133,    212,    906,    680
Region_4    212,    581,    893,    514,    530,    795

And am able to load this in and add additional columns to my data frame that adds specific columns (ie. Sam + Frank, Dennis + Georgia).

I can do it if I hard code the teams into my code, eg:

temp <- mutate(temp, team_1 = Sam + Robert) 

My issue is that I have another csv that contains a list of 2 columns, and these are teams that changes daily, eg:

Team 1  Sam + Robert
Team 2  Frank + Georgia
Team 3  Frank + Steven
Team 4  Robert + Dennis
Team 5  Frank + Sam

Is there a way that I can load this csv list into R and add these team member columns together.

But I can't see to add this list in from the csv file and have a new data frame with the total of team members dynamically.

Is anyone able to help me with this?

I'm looking to end up with something like this (with the sum of the team members (team 1, team 2, etc from the second csv file) as additional columns:

         Sam,   Frank,  Dennis, Steven, Robert, Georgia,    Sam + Robert,   Frank + Georgia,    Frank + Steven, Robert + Dennis,    Frank + Sam
 Region_1   218,    763,    811,    812,    797,    574,    981,    1337,   1575,   1608,   981
 Region_2   474,    983,    343,    697,    310,    335,    1457,   1318,   1680,   653,    1457
 Region_3   700,    813,    133,    212,    906,    680,    1513,   1493,   1025,   1039,   1513
 Region_4   212,    581,    893,    514,    530,    795,    793,    1376,   1095,   1423,   793

With much thanks.

like image 241
nicshah Avatar asked Sep 26 '18 03:09

nicshah


People also ask

How do you turn a string into a formula?

For a non-VBA solution, use the INDIRECT formula. It takes a string as an argument and converts it to a cell reference. For example, =0.4*INDIRECT("A1") will return the value of 0.4 * the value that's in cell A1 of that worksheet. If cell A1 was, say, 10, then =0.4*INDIRECT("A1") would return 4.

How do you enter a string as a formula in Excel?

The easiest way to add a text string to a cell is to use an ampersand character (&), which is the concatenation operator in Excel. This works in all versions of Excel 2007 - Excel 365.

Can a text be used in a formula?

The data converted into text cannot be used for calculations. If needed, we should keep the original data in a hidden format and use it for other formulas.

Is there a string function in Excel?

Excel String Functions – SummaryExcel string functions can be used to retrieve specific characters within a string.


2 Answers

Working with code as strings is sort of complicated. Here, it's simpler to turn the second data frame into a lookup table which can be joined to the first reshaped to long form. The result can then be summarized and reshaped into whatever form you'd like.

library(tidyverse)

people <- data.frame(
    region = c("Region_1", "Region_2", "Region_3", "Region_4"),
    Sam = c(218L, 474L, 700L, 212L),
    Frank = c(763L, 983L, 813L, 581L),
    Dennis = c(811L, 343L, 133L, 893L),
    Steven = c(812L, 697L, 212L, 514L),
    Robert = c(797L, 310L, 906L, 530L),
    Georgia = c(574L, 335L, 680L, 795L), 
    stringsAsFactors = FALSE
)

teams <- data.frame(
    team = c("Team 1", "Team 2", "Team 3", "Team 4", "Team 5"),
    members = c("Sam + Robert", "Frank + Georgia", "Frank + Steven", "Robert + Dennis", "Frank + Sam"), 
    stringsAsFactors = FALSE
)

teams_long <- teams %>% 
    mutate(team = members) %>%    # set team name
    separate_rows(members)    # separate teams to names

people %>% 
    gather(member, value, -region) %>%    # reshape to long form
    left_join(teams_long, c('member' = 'members')) %>%    # join by names
    group_by(region, team) %>% 
    summarise(value = sum(value)) %>%    # aggregate
    spread(team, value) %>%    # reshape to wide form
    left_join(people, .)    # join back to original data
#>     region Sam Frank Dennis Steven Robert Georgia Frank + Georgia
#> 1 Region_1 218   763    811    812    797     574            1337
#> 2 Region_2 474   983    343    697    310     335            1318
#> 3 Region_3 700   813    133    212    906     680            1493
#> 4 Region_4 212   581    893    514    530     795            1376
#>   Frank + Sam Frank + Steven Robert + Dennis Sam + Robert
#> 1         981           1575            1608         1015
#> 2        1457           1680             653          784
#> 3        1513           1025            1039         1606
#> 4         793           1095            1423          742
like image 78
alistaire Avatar answered Sep 26 '22 07:09

alistaire


Here is an option using rlang::parse_expr to directly parse the expression from the second data.frame

df1 %>% bind_cols(map_dfc(df2$V2, ~df1 %>%
    transmute(!!as.character(.x) := !!rlang::parse_expr(as.character(.x)))))
#  Sam Frank Dennis Steven Robert Georgia Sam + Robert Frank + Georgia
#1 218   763    811    812    797     574         1015            1337
#2 474   983    343    697    310     335          784            1318
#3 700   813    133    212    906     680         1606            1493
#4 212   581    893    514    530     795          742            1376
#  Frank + Steven Robert + Dennis Frank + Sam
#1           1575            1608         981
#2           1680             653        1457
#3           1025            1039        1513
#4           1095            1423         793

Or you can name the new columns according to the team:

df1 %>% bind_cols(map2_dfc(df2$V1, df2$V2, ~df1 %>%
    transmute(!!as.character(.x) := !!rlang::parse_expr(as.character(.y)))))
#  Sam Frank Dennis Steven Robert Georgia Team 1 Team 2 Team 3 Team 4 Team 5
#1 218   763    811    812    797     574   1015   1337   1575   1608    981
#2 474   983    343    697    310     335    784   1318   1680    653   1457
#3 700   813    133    212    906     680   1606   1493   1025   1039   1513
#4 212   581    893    514    530     795    742   1376   1095   1423    793

The idea is that we use map (or map2) to apply the "formula" stored in df2 within a transmute context of df1. The expression gets evaluated with !!parse_expr(...).


Sample data

df1 <- read.table(text =
    "Sam Frank Dennis Steven Robert Georgia
Region_1    218    763    811    812    797    574
Region_2    474    983    343    697    310    335
Region_3    700    813    133    212    906    680
Region_4    212    581    893    514    530    795", header = T)


df2 <- read.table(text =
    "'Team 1'  'Sam + Robert'
'Team 2'  'Frank + Georgia'
'Team 3'  'Frank + Steven'
'Team 4'  'Robert + Dennis'
'Team 5'  'Frank + Sam'", header = F)
like image 35
Maurits Evers Avatar answered Sep 22 '22 07:09

Maurits Evers