
Extracting alternating sequence from vector in R

Tags: r, vector, sequence

I have data that looks like the following:

A= c(0,0,0,-1,0,0,0,1,1,1,0,0,-1,0,0,-1,-1,1,1,1,-1,0,0,0,-1,0,0,-1,-1,1,1,0,0,0,0,1,-1)

The goal is to extract alternating -1s and 1s. I want to write a function whose input is a vector containing only 0, 1, and -1. Ideally, the output keeps all the 0s and only the alternating -1s and 1s; every other nonzero value is set to 0.

For instance, the desired output for the above example is:

B = c(0,0,0,-1,0,0,0,1,0,0,0,0,-1,0,0,0,0,1,0,0,-1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,-1)

The two 1s in the 9th and 10th positions of A are turned to 0 because we only keep the first 1 or -1 in each run. The -1s in the 16th and 17th positions of A are turned to 0 for the same reason.

Does anyone have a good idea for writing such a function?

jay2020 asked Dec 25 '22 09:12



1 Answer

Identify positions of nonzero values:

w = which(A != 0)

For each run of identical values in A[w], take the position of the first element:

library(data.table)
wkeep = tapply(w, rleid(A[w]), FUN = function(x) x[1])

Set all other values to zero:

# following @alexis_laz's approach
B = numeric(length(A)) 
B[ wkeep ] = A[ wkeep ]

This way, you avoid comparing elements one at a time in an explicit loop, which I think is slow in R.
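Since the question asks for a function, the steps above could be wrapped up roughly as follows. This is only a sketch: the name keep_alternating is made up here, and it uses base R's rle() rather than rleid to find where each run starts, so it runs without data.table.

```r
# Hypothetical helper; the name and signature are not from the original answer.
# Assumes A contains only 0, 1 and -1.
keep_alternating <- function(A) {
  w <- which(A != 0)                            # positions of nonzero values
  r <- rle(A[w])                                # runs of equal values among the nonzero entries
  first <- cumsum(r$lengths) - r$lengths + 1L   # index into w where each run starts
  wkeep <- w[first]                             # positions in A to keep
  B <- numeric(length(A))                       # everything else stays zero
  B[wkeep] <- A[wkeep]
  B
}

A <- c(0,0,0,-1,0,0,0,1,1,1,0,0,-1,0,0,-1,-1,1,1,1,-1,0,0,0,-1,0,0,-1,-1,1,1,0,0,0,0,1,-1)
B <- keep_alternating(A)
which(B != 0)  # 4 8 13 18 21 30 37
```

Because the run starts are computed arithmetically from the run lengths, a vector with no nonzero values simply comes back as all zeros.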


rleid comes from data.table. With base R, you can make wkeep with @alexis_laz's suggestion:

wkeep = w[c(TRUE, A[w][-1L] != A[w][-length(w)])]

Or write your own rleid, as in Josh's answer.
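A home-made rleid could look something like the sketch below. It is built on base R's rle() and is not necessarily what Josh's answer does; the name rleid_base is hypothetical, it only handles a single vector, and its treatment of NA may differ from data.table's rleid.

```r
# Hypothetical stand-in for data.table::rleid, for one atomic vector only.
rleid_base <- function(x) {
  r <- rle(x)                            # lengths of consecutive runs of equal values
  rep(seq_along(r$lengths), r$lengths)   # run number repeated once per element of the run
}

A <- c(0,0,0,-1,0,0,0,1,1,1,0,0,-1,0,0,-1,-1,1,1,1,-1,0,0,0,-1,0,0,-1,-1,1,1,0,0,0,0,1,-1)
w <- which(A != 0)

# Same tapply call as above, with rleid_base in place of rleid
wkeep <- tapply(w, rleid_base(A[w]), FUN = function(x) x[1])
```

With wkeep built this way, the remaining step (filling B) is unchanged.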

Frank answered Jan 08 '23 21:01
