I have a binary sample like this:
Z = c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)
I would like to read this sample and convert every sequence of length 4 (each window of 4 consecutive values) into a number.
Example: the sequence 0000 becomes 1, 0001 becomes 2, 0010 becomes 3, ..., and 1111 becomes 16.
The expected output should be a new sample formed by the numbers 1, 2, 3, ..., 16, having the same length as the original sample:
Z = c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)
Z1 = c(2,3,6,12,8,15,14,11,5,10,3,11,6,12,8,15,14,11,6,11)
How can I do that in R?
Try:
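z <- c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)
y <- as.character(z)
# For each start position x, paste y[x], ..., y[x+3] into one string, read it as base 2, then add 1
z1 <- sapply(1:(length(y) - 3), function(x) { strtoi(paste(y[x:(x + 3)], collapse = ''), 2) + 1 })
z1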
[1] 2 3 6 12 8 15 14 11 5 10 3 6 11 6 12 8 15 14 11 6 11
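The code works like this:
- convert z to a character vector (y);
- for each starting position, paste the 4 consecutive elements into a single string such as "0001";
- convert that string to a number with the strtoi function.
The strtoi function converts a string to an integer, given the base of the input number (here 2, because it is binary). We add 1 because in binary 0000 equals 0, not 1.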
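To make these steps concrete, here is a small sketch that traces only the first window by hand. The intermediate names (window, bits, value, code) are made up for illustration and are not part of the answer's code.

```r
z <- c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)
y <- as.character(z)

window <- y[1:4]                        # the first four digits as strings
bits   <- paste(window, collapse = "")  # glued together into "0001"
value  <- strtoi(bits, base = 2)        # "0001" read as a base-2 number
code   <- value + 1                     # shifted so that "0000" maps to 1
```

Repeating this for every start position from 1 to length(y) - 3 is what the sapply call does.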
Note: the conversion to character is optional, because paste coerces its arguments to character anyway. You can directly do:
sapply(1:(length(z)-3),function(x){strtoi(paste(z[x:(x+3)],collapse=''),2)+1})
It is also slightly faster to use vapply:
vapply(1:(length(z)-3),function(x){strtoi(paste(z[x:(x+3)],collapse=''),2)+1},FUN.VALUE=1)
Unit: microseconds
   expr     min      lq     mean   median      uq     max neval cld
 vapply 206.866 209.111 214.3936 210.0735 211.356 338.362   100   a
 sapply 230.278 231.882 234.0249 232.8440 234.128 273.897   100   b
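The table above looks like output from the microbenchmark package, although the answer does not say how it was produced. The following is a rough sketch of how such a comparison could be run, assuming that package is installed. The wrapper names use_sapply and use_vapply are hypothetical, and timings will differ from machine to machine.

```r
z <- c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)

# Hypothetical wrappers around the two variants from the answer
use_sapply <- function() {
  sapply(1:(length(z) - 3), function(x) strtoi(paste(z[x:(x + 3)], collapse = ""), 2) + 1)
}
use_vapply <- function() {
  vapply(1:(length(z) - 3), function(x) strtoi(paste(z[x:(x + 3)], collapse = ""), 2) + 1,
         FUN.VALUE = 1)
}

# Both variants should give exactly the same vector
stopifnot(identical(use_sapply(), use_vapply()))

# Assumption: microbenchmark is available; the timing is skipped otherwise
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(vapply = use_vapply(), sapply = use_sapply(), times = 100))
}
```

Apart from speed, vapply checks that every result matches FUN.VALUE (here a single numeric value), so an unexpected result raises an error instead of silently changing the shape of the output.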
Here is another approach:
Z <- c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)
Z.tmp <- embed(Z,4)
Z1 <- as.vector(Z.tmp %*% c(1,2,4,8) + 1)
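The reason the weights are c(1,2,4,8) rather than c(8,4,2,1) is that embed stores each window in reverse order: row i holds Z[i+3], Z[i+2], Z[i+1], Z[i]. The sketch below shows this and wraps the idea in a helper for an arbitrary window length. The name window_codes and its parameter k are hypothetical, introduced here only for illustration.

```r
Z <- c(0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0)

# One window per row, newest value in the first column
first_rows <- head(embed(Z, 4), 3)
print(first_rows)

# Hypothetical helper: same idea for any window length k
window_codes <- function(x, k) {
  weights <- 2^(0:(k - 1))  # 1, 2, 4, ... matching the reversed column order
  as.vector(embed(x, k) %*% weights + 1)
}

codes <- window_codes(Z, 4)
print(codes)
```

Because this works on the whole matrix at once, it avoids building and parsing a string for every window.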