 

Reducing the number of columns using a condition in R

Tags: r, matrix

I have a big matrix with more than 1000 rows and 100 columns. In each row, only 6-10 columns hold non-zero values and the rest are zeros. I want to create a matrix with only 5 columns that contains, for each row, the values of the first 5 consecutive non-zero columns (a run with no zeros in between). For example:

A = structure(c(0, 1L, 6L, 0, 2L, 0, 2L, 0, 1L, 4L, 1L, 3L, 7L, 2L, 6L, 2L, 4L, 0, 3L, 0, 3L, 5L, 1L, 4L, 0, 4L, 6L, 2L, 0, 0, 5L, 0, 3L, 5L, 0, 0, 0, 4L, 6L, 7L, 0, 7L, 5L, 7L, 8L, 6L, 0, 0, 8L, 9L, 0, 0, 0, 9L, 1L, 0 , 0, 0, 0, 2L, 7L, 0, 2L, 0, 0, 1L, 8L, 4, 0, 0), .Dim = c(5L, 14L))

#A =
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#[1,]    0    0    1    2    3    4    5    0    0     6     0     0     7     1
#[2,]    1    2    3    4    5    6    0    0    7     0     0     0     0     8
#[3,]    6    0    7    0    1    2    3    4    5     0     0     0     2     4
#[4,]    0    1    2    3    4    0    5    6    7     8     9     0     0     0
#[5,]    2    4    6    0    0    0    0    7    8     9     1     2     0     0

I want this matrix:

B = structure(c(1L, 1L, 1L, 5L, 7L, 2L, 2L, 2L, 6L, 8L, 3L, 3L, 3L, 7L, 9L, 4L, 4L, 4L, 8L, 1L, 5L, 5L, 5L, 9L, 2L), .Dim = c(5L, 5L))


#B = 
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    2    3    4    5
#[2,]    1    2    3    4    5
#[3,]    1    2    3    4    5
#[4,]    5    6    7    8    9
#[5,]    7    8    9    1    2

My code:

df = data.frame(A)
B = do.call(rbind, lapply(1:NROW(df), function(i) df[i,][(df[i,])!=0][1:5]))
# or
B = t(apply(X = df, MARGIN = 1, function(x) x[x!=0][1:5]))

My code works fine for the first two rows of A but gives the wrong result for the remaining rows, because it takes the first 5 non-zero values even when zeros separate them. I also thought about getting the indexes of the non-zero columns, checking whether 5 of them are consecutive (without any gap between them), and then retrieving their values. Any help is much appreciated!
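To make the failure concrete, here is a minimal sketch on row 3 of A. The variable names `x`, `nonzero` and `first_five` are only for illustration and are not part of the code above.

```r
# Row 3 of A from the example above
x <- c(6, 0, 7, 0, 1, 2, 3, 4, 5, 0, 0, 0, 2, 4)

# Dropping the zeros first throws away the information about gaps,
# so values from separate runs end up next to each other
nonzero <- x[x != 0]
first_five <- nonzero[1:5]
print(first_five)  # 6 7 1 2 3, whereas the wanted result is 1 2 3 4 5
```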

Zryan asked Oct 29 '22 03:10


1 Answer

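Here is an option using rle (run-length encoding): find the runs of non-zero entries in each row, keep only the runs that are at least 5 long, and take the first 5 values.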

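t(apply(A, 1, function(x) {
  rl <- rle(x != 0)  # runs of non-zero (TRUE) and zero (FALSE) entries
  # set every run that is not a non-zero run of length >= 5 to FALSE, then take the first 5 values
  head(x[inverse.rle(within.list(rl, values[!(values & lengths >= 5)] <- FALSE))], 5)
}))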
#      [,1] [,2] [,3] [,4] [,5]
#[1,]    1    2    3    4    5
#[2,]    1    2    3    4    5
#[3,]    1    2    3    4    5
#[4,]    5    6    7    8    9
#[5,]    7    8    9    1    2
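The one-liner above can be hard to follow, so here is a rough step-by-step sketch of the same rle idea on a single row. The helper name `first_run` and its `n` argument are hypothetical, and padding with NA when a row has no run of at least `n` non-zero values is an assumption; the original answer does not cover that case.

```r
# Hypothetical helper (not from the original answer): the first `n` values of
# the first run of at least `n` consecutive non-zero entries in `x`.
first_run <- function(x, n = 5) {
  rl <- rle(x != 0)                          # runs of TRUE (non-zero) / FALSE (zero)
  hit <- which(rl$values & rl$lengths >= n)  # non-zero runs that are long enough
  if (length(hit) == 0) {
    return(rep(NA_real_, n))                 # assumption: pad with NA if no such run
  }
  # index in x where the first qualifying run starts
  start <- sum(rl$lengths[seq_len(hit[1] - 1)]) + 1
  x[start:(start + n - 1)]
}

x <- c(6, 0, 7, 0, 1, 2, 3, 4, 5, 0, 0, 0, 2, 4)  # row 3 of A
run_lengths <- rle(x != 0)$lengths  # 1 1 1 1 5 3 2
res_row3 <- first_run(x)            # 1 2 3 4 5
res_none <- first_run(c(1, 0, 2, 0, 3))  # all NA: no run of 5
```

Applied to the whole matrix, this would be `t(apply(A, 1, first_run))`. Returning a fixed-length vector keeps `apply` from producing a ragged list when some row has no qualifying run.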
akrun answered Nov 15 '22 07:11