I've tried searching through the forums and unable to find help. I'm quite new to R, and am having limited success in loading in some strings to be used as a formula.
I have a csv with the following format:
Sam, Frank, Dennis, Steven, Robert, Georgia
Region_1 218, 763, 811, 812, 797, 574
Region_2 474, 983, 343, 697, 310, 335
Region_3 700, 813, 133, 212, 906, 680
Region_4 212, 581, 893, 514, 530, 795
And am able to load this in and add additional columns to my data frame that adds specific columns (ie. Sam + Frank, Dennis + Georgia).
I can do it if I hard code the teams into my code, eg:
temp <- mutate(temp, team_1 = Sam + Robert)
My issue is that I have another csv that contains a list of 2 columns, and these are teams that changes daily, eg:
Team 1 Sam + Robert
Team 2 Frank + Georgia
Team 3 Frank + Steven
Team 4 Robert + Dennis
Team 5 Frank + Sam
Is there a way that I can load this csv list into R and add these team member columns together.
But I can't see to add this list in from the csv file and have a new data frame with the total of team members dynamically.
Is anyone able to help me with this?
I'm looking to end up with something like this (with the sum of the team members (team 1, team 2, etc from the second csv file) as additional columns:
Sam, Frank, Dennis, Steven, Robert, Georgia, Sam + Robert, Frank + Georgia, Frank + Steven, Robert + Dennis, Frank + Sam
Region_1 218, 763, 811, 812, 797, 574, 981, 1337, 1575, 1608, 981
Region_2 474, 983, 343, 697, 310, 335, 1457, 1318, 1680, 653, 1457
Region_3 700, 813, 133, 212, 906, 680, 1513, 1493, 1025, 1039, 1513
Region_4 212, 581, 893, 514, 530, 795, 793, 1376, 1095, 1423, 793
With much thanks.
For a non-VBA solution, use the INDIRECT formula. It takes a string as an argument and converts it to a cell reference. For example, =0.4*INDIRECT("A1") will return the value of 0.4 * the value that's in cell A1 of that worksheet. If cell A1 was, say, 10, then =0.4*INDIRECT("A1") would return 4.
The easiest way to add a text string to a cell is to use an ampersand character (&), which is the concatenation operator in Excel. This works in all versions of Excel 2007 - Excel 365.
The data converted into text cannot be used for calculations. If needed, we should keep the original data in a hidden format and use it for other formulas.
Excel String Functions – SummaryExcel string functions can be used to retrieve specific characters within a string.
Working with code as strings is sort of complicated. Here, it's simpler to turn the second data frame into a lookup table which can be joined to the first reshaped to long form. The result can then be summarized and reshaped into whatever form you'd like.
library(tidyverse)
people <- data.frame(
region = c("Region_1", "Region_2", "Region_3", "Region_4"),
Sam = c(218L, 474L, 700L, 212L),
Frank = c(763L, 983L, 813L, 581L),
Dennis = c(811L, 343L, 133L, 893L),
Steven = c(812L, 697L, 212L, 514L),
Robert = c(797L, 310L, 906L, 530L),
Georgia = c(574L, 335L, 680L, 795L),
stringsAsFactors = FALSE
)
teams <- data.frame(
team = c("Team 1", "Team 2", "Team 3", "Team 4", "Team 5"),
members = c("Sam + Robert", "Frank + Georgia", "Frank + Steven", "Robert + Dennis", "Frank + Sam"),
stringsAsFactors = FALSE
)
teams_long <- teams %>%
mutate(team = members) %>% # set team name
separate_rows(members) # separate teams to names
people %>%
gather(member, value, -region) %>% # reshape to long form
left_join(teams_long, c('member' = 'members')) %>% # join by names
group_by(region, team) %>%
summarise(value = sum(value)) %>% # aggregate
spread(team, value) %>% # reshape to wide form
left_join(people, .) # join back to original data
#> region Sam Frank Dennis Steven Robert Georgia Frank + Georgia
#> 1 Region_1 218 763 811 812 797 574 1337
#> 2 Region_2 474 983 343 697 310 335 1318
#> 3 Region_3 700 813 133 212 906 680 1493
#> 4 Region_4 212 581 893 514 530 795 1376
#> Frank + Sam Frank + Steven Robert + Dennis Sam + Robert
#> 1 981 1575 1608 1015
#> 2 1457 1680 653 784
#> 3 1513 1025 1039 1606
#> 4 793 1095 1423 742
Here is an option using rlang::parse_expr
to directly parse the expression from the second data.frame
df1 %>% bind_cols(map_dfc(df2$V2, ~df1 %>%
transmute(!!as.character(.x) := !!rlang::parse_expr(as.character(.x)))))
# Sam Frank Dennis Steven Robert Georgia Sam + Robert Frank + Georgia
#1 218 763 811 812 797 574 1015 1337
#2 474 983 343 697 310 335 784 1318
#3 700 813 133 212 906 680 1606 1493
#4 212 581 893 514 530 795 742 1376
# Frank + Steven Robert + Dennis Frank + Sam
#1 1575 1608 981
#2 1680 653 1457
#3 1025 1039 1513
#4 1095 1423 793
Or you can name the new columns according to the team:
df1 %>% bind_cols(map2_dfc(df2$V1, df2$V2, ~df1 %>%
transmute(!!as.character(.x) := !!rlang::parse_expr(as.character(.y)))))
# Sam Frank Dennis Steven Robert Georgia Team 1 Team 2 Team 3 Team 4 Team 5
#1 218 763 811 812 797 574 1015 1337 1575 1608 981
#2 474 983 343 697 310 335 784 1318 1680 653 1457
#3 700 813 133 212 906 680 1606 1493 1025 1039 1513
#4 212 581 893 514 530 795 742 1376 1095 1423 793
The idea is that we use map
(or map2
) to apply the "formula" stored in df2
within a transmute
context of df1
. The expression gets evaluated with !!parse_expr(...)
.
df1 <- read.table(text =
"Sam Frank Dennis Steven Robert Georgia
Region_1 218 763 811 812 797 574
Region_2 474 983 343 697 310 335
Region_3 700 813 133 212 906 680
Region_4 212 581 893 514 530 795", header = T)
df2 <- read.table(text =
"'Team 1' 'Sam + Robert'
'Team 2' 'Frank + Georgia'
'Team 3' 'Frank + Steven'
'Team 4' 'Robert + Dennis'
'Team 5' 'Frank + Sam'", header = F)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With