I would like to collapse a CIGAR vector into a CIGAR string. By that I mean a function that converts this:
cigar.vector = c("M", "M", "I", "I", "M", "I", "", "M", "D", "D", "M", "I", "D", "M", "I")
to this:
cigar.string = "2M2I1M1I1M2D1M1I1D1M1I"
and vice versa.
Note that the vector contains a "" (empty string), which should not be counted. Thanks!
rle seems the obvious choice here:
rcv <- rle(cigar.vector[cigar.vector!=""])
paste0(rcv$lengths,rcv$values,collapse="")
#[1] "2M2I1M1I1M2D1M1I1D1M1I"
If you want to get fancy, you could also exploit the fact that rle gives a list of length 2:
paste(do.call(rbind,rle(cigar.vector[cigar.vector!=""])),collapse="")
#[1] "2M2I1M1I1M2D1M1I1D1M1I"
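To see why the rbind version works, it may help to look at the intermediate objects. The following is only a sketch on a shorter, made-up vector x (not the one from the question): rbind stacks the lengths on top of the values in a two-row matrix, and paste then reads that matrix column by column, which interleaves each count with its letter.

```r
# Hypothetical short input, chosen only for illustration
x <- c("M", "M", "I", "", "M")

r <- rle(x[x != ""])
r$lengths   # run lengths: 2 1 1
r$values    # run values:  "M" "I" "M"

# rbind() is called with the two list components as its arguments,
# giving a 2-row character matrix (row 1 = lengths, row 2 = values)
m <- do.call(rbind, r)

# Matrices are stored column-major, so paste() sees
# "2","M","1","I","1","M" in that order
collapsed <- paste(m, collapse = "")
collapsed
```

The paste0 version above is probably easier to read; this one mainly saves a temporary variable.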
Going backwards exactly is impossible given only the result (assign the output above to result), because the string no longer records where the "" entries were. Ignoring those entries, you can get close enough with something like:
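backwards <- rep(
  # operation letters: split on the digit runs, drop the leading ""
  unlist(strsplit(result, "\\d+"))[-1],
  # run lengths: split on every non-digit character
  as.numeric(unlist(strsplit(result, "[^0-9]")))
)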
identical(cigar.vector[cigar.vector!=""],backwards)
#[1] TRUE
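If you need both directions more than once, the two steps could be wrapped in a pair of small functions. This is a minimal sketch, not a definitive implementation: the names cigar_collapse and cigar_expand are made up for illustration, and the expand step uses regmatches with gregexpr instead of strsplit, which is just an alternative way to pull out the numbers and the letters. It assumes every operation is a single non-digit character.

```r
# Hypothetical helper names; they are not part of any package
cigar_collapse <- function(v) {
  v <- v[v != ""]                 # drop empty entries, as in the question
  r <- rle(v)                     # run-length encode what is left
  paste0(r$lengths, r$values, collapse = "")
}

cigar_expand <- function(s) {
  # Extract the run lengths (digit runs) and the operation letters
  # (assumed here to be single non-digit characters) separately
  n  <- as.integer(regmatches(s, gregexpr("[0-9]+", s))[[1]])
  op <- regmatches(s, gregexpr("[^0-9]", s))[[1]]
  rep(op, times = n)              # repeat each letter by its run length
}

# Usage with the vector from the question
cigar.vector <- c("M", "M", "I", "I", "M", "I", "", "M",
                  "D", "D", "M", "I", "D", "M", "I")
s <- cigar_collapse(cigar.vector)
round_trip <- cigar_expand(s)
identical(round_trip, cigar.vector[cigar.vector != ""])
```

As with the one-liner above, the round trip only recovers the vector without its "" entries, since the string does not record them.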