R: Paste together some string vector elements based on list of indexes

I have a string vector like this:

x <- c("ermanaric cayce nonwashable climactical outseeing dorr nubble",
       "aver unsegregating preprofess lumme noontime triskele",
       "riverbank walachian penza",
       "schlieren calthrop",
       "hutlike paraphyllium unservile chaplainship bordelaise",
       "phlogotic strategics jowlier orthopaedic nonprofiteering",
       "vizir rudenture shopkeeper",
       "interestuarine sardis",
       "anthas figuring",
       "unphased engle german emporium organometallic didy uneclipsing",
       "bronzy conant reballot",
       "extrados facinorous acrolithic",
       "paralyzation uningratiating enzymatically enuresis",
       "unscholastic extemporarily",
       "discipleship fossilize summae",
       "concretize intercharge palpate gombroon initiatrices",
       "intimation progressiveness",
       "unpictorialise",
       "romanticization",
       "wynnewood",
       "unmate libratory polysynthetic")

Some of the elements need to be pasted together to form longer strings. I have a list of vectors that contains indexes of the elements that have to be pasted together:

indx <- list(c(3, 4), c(7, 8, 9), c(11, 12), c(14, 15), c(17, 18, 19, 20))

That is, the 3rd and 4th elements have to be pasted together to form the string "riverbank walachian penza schlieren calthrop", the 7th, 8th and 9th elements to form the string "vizir rudenture shopkeeper interestuarine sardis anthas figuring", and so on, keeping the rest of the strings in the same order. The resulting vector of strings would look like this:

y <- c("ermanaric cayce nonwashable climactical outseeing dorr nubble",
       "aver unsegregating preprofess lumme noontime triskele",
       "riverbank walachian penza schlieren calthrop",
       "hutlike paraphyllium unservile chaplainship bordelaise",
       "phlogotic strategics jowlier orthopaedic nonprofiteering",
       "vizir rudenture shopkeeper interestuarine sardis anthas figuring",
       "unphased engle german emporium organometallic didy uneclipsing",
       "bronzy conant reballot extrados facinorous acrolithic",
       "paralyzation uningratiating enzymatically enuresis",
       "unscholastic extemporarily discipleship fossilize summae",
       "concretize intercharge palpate gombroon initiatrices",
       "intimation progressiveness unpictorialise romanticization wynnewood",
       "unmate libratory polysynthetic")

I tried the following without any success:

myfun <- function(obj, indx) {
  paste(obj)[length(indx)]
}

mapply(myfun, x, m)

Can someone help?

panman asked Feb 10 '23 03:02


2 Answers

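indx has no entry for the elements of x that should stay as they are, yet every element has to appear in the result, either unchanged or merged. That makes this a bit more challenging.

One idea is to update x successively with Reduce: each group is replaced by its pasted string followed by temporary NAs, so the remaining index numbers still point at the right elements, and the NAs are dropped at the end.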

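# Put each pasted group in its first slot, mark its other slots NA, then drop the NAs
my.y <- c(na.omit(Reduce(function(s, i)
  replace(s, i, c(paste(s[i], collapse = " "), rep(NA, length(i) - 1))),
  indx, x)))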

identical(my.y,y)
#> [1] TRUE

The result matches the desired output y.
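To make the NA placeholder idea more concrete, here is a minimal sketch of a single update step on a small toy vector. The names s, i, step and result are only illustrative and are not part of the question's data.

```r
# Toy data for illustration only (not from the question)
s <- c("a", "b", "c", "d", "e")
i <- c(2, 3)  # one index group: merge the 2nd and 3rd elements

# One Reduce step: the pasted string goes into the first slot of the group,
# the remaining slots become NA so later indexes are not shifted
step <- replace(s, i, c(paste(s[i], collapse = " "), rep(NA, length(i) - 1)))
step

# After all groups are processed, the placeholders are dropped;
# c() strips the attribute that na.omit() attaches
result <- c(na.omit(step))
result
```

Because the vector keeps its original length until the final na.omit, each later group in indx can still be addressed by its original positions.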

A. Webb answered Feb 11 '23 20:02


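indx <- list(c(3, 4), c(7, 8, 9), c(11, 12), c(14, 15), c(17, 18, 19, 20))

# Add every index that is not in indx as its own single-element group
indx1 <- c(lapply(setdiff(seq_along(x), unlist(indx)), c), indx)

# Order the groups by their first index to restore the original order
indx2 <- indx1[order(sapply(indx1, "[[", 1))]

# Paste the elements of each group together
sapply(indx2, function(z) paste(x[z], collapse = " "))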
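
A related variant, offered only as a sketch: instead of building a complete list of index groups, give every element a group label and split on it. The helper name merge_groups and the toy vector are hypothetical, and the sketch assumes the groups in indx do not overlap.

```r
# Hypothetical helper (not from the answer): merge elements of x by index groups
merge_groups <- function(x, indx) {
  grp <- seq_along(x)              # every element starts in its own group
  for (i in indx) grp[i] <- i[1]   # group members share the label of their first index
  # split() orders the groups by label, which keeps the original order
  unname(vapply(split(x, grp), paste, character(1), collapse = " "))
}

# Toy data for illustration only
toy <- c("a", "b", "c", "d", "e", "f")
merged <- merge_groups(toy, list(c(2, 3), c(5, 6)))
merged
```

Called as merge_groups(x, indx) on the question's data, this should give the same 13 strings as y.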
Rohit Das answered Feb 11 '23 22:02