I would like some assistance, please, with selecting parts of a string in certain rows of an R data frame. I have mocked up some dummy data below (floyd) to illustrate.
The first data frame row has only one word in each column (it's a number, yes, but I am treating all numbers as characters/words), while rows 2 to 4 have more than one word. I would like to select one word in each cell of a row, based on the position given for that row in the named vector cool_floyd_position.
# NB: the stringr package must be installed for my solution attempt!
# some scenario data
floyd = data.frame(people = c("roger", "david", "rick", "nick"),
spec1 = c("1", "3 5 75 101", "3 65 85", "12 2"),
spec2 = c("45", "75 101 85 12", "45 65 8", "45 87" ),
spec3 = c("1", "3 5 75 101", "75 98 5", "65 32"))
# tweak my data
rownames(floyd) = floyd$people
floyd$people = NULL
# ppl of interest
cool_floyd = rownames(floyd)[2:4]
# ppl string position criteria
cool_floyd_position = c(2,3,1)
names(cool_floyd_position) = c("david", "rick", "nick")
# my solution attempt
for(i in 1:length(cool_floyd))
{
select_ppl = cool_floyd[i]
string_select = cool_floyd_position[i]
floyd[row.names(floyd) == select_ppl,] = apply(floyd[row.names(floyd) == select_ppl], 1,
function(x) unlist(stringr::str_split(x, " ")[string_select]))
}
I am trying to get my floyd data frame to look like the following, where the second word is selected for all david columns, the third word for all rick columns and the first word for all nick columns (the roger columns should remain as they are):
my_target_df = data.frame(people = c("roger", "david", "rick", "nick"),
spec1 = c("1", "5", "85", "12"),
spec2 = c("45", "101", "8", "45" ),
spec3 = c("1", "5", "5", "65"))
row.names(my_target_df) = my_target_df$people
my_target_df$people = NULL
Many thanks in advance!
Here is another option using mapply together with word from stringr:
library(stringr)
# convert the factor columns to character
floyd[] <- lapply(floyd, as.character)
# transpose floyd, keep the columns for the people named in c1, and convert to a data.frame,
# so that mapply can pass each person's values to word() together with the matching position in c1;
# then transpose the result and assign it back to those rows of 'floyd'
floyd[names(c1),] <- t(mapply(function(x,y) word(x, y),
as.data.frame(t(floyd)[, names(c1)], stringsAsFactors=FALSE), c1))
floyd
# spec1 spec2 spec3
#roger 1 45 1
#david 5 101 5
#rick 85 8 5
#nick 12 45 65
where
c1 <- cool_floyd_position #just to avoid typing
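For comparison, the same selection can be sketched in base R without stringr. This is only a minimal sketch: it rebuilds the example data with character columns, assumes words are separated by a single space, and uses pick_word(), a hypothetical helper name that does not appear in the question or the answer above.

```r
# Rebuild the example data with character (not factor) columns
floyd <- data.frame(
  spec1 = c("1", "3 5 75 101", "3 65 85", "12 2"),
  spec2 = c("45", "75 101 85 12", "45 65 8", "45 87"),
  spec3 = c("1", "3 5 75 101", "75 98 5", "65 32"),
  row.names = c("roger", "david", "rick", "nick"),
  stringsAsFactors = FALSE
)
cool_floyd_position <- c(david = 2, rick = 3, nick = 1)

# pick_word() is a hypothetical helper: it returns the n-th space-separated
# word of each element of x (NA if an element has fewer than n words)
pick_word <- function(x, n) {
  vapply(strsplit(x, " ", fixed = TRUE), function(words) words[n], character(1))
}

# Replace each named person's row with the word at that person's position
for (person in names(cool_floyd_position)) {
  n <- cool_floyd_position[[person]]
  floyd[person, ] <- unname(pick_word(unlist(floyd[person, ]), n))
}
floyd
```

The roger row is never touched because roger is not named in cool_floyd_position, so the result should match the values in my_target_df.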