Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Left shift columns in R

I have a data set like this

temp <- structure(list(col_1 = c("", "P9603", "", "", "11040", 
"80053"), col_2 = c("84484", "80061", "", "80061", "A0428", "85025"
), col_3 = c("V2632", "82310", "", "", "", "86357"), col_4 = c("J1170", 
"84305", "62311", "80061", "", ""), col_5 = c("", "86708", "J0690", 
"", "", "")), .Names = c("col_1", "col_2", "col_3", "col_4", 
"col_5"), class = c("data.table", "data.frame"))

   col_1 col_2 col_3 col_4 col_5
1:       84484 V2632 J1170      
2: P9603 80061 82310 84305 86708
3:                   62311 J0690
4:       80061       80061                        
5: 11040 A0428                  
6: 80053 85025 86357 

Is there a possibility to shift the columns like this

   col_1 col_2 col_3 col_4 col_5
1: 84484 V2632 J1170              #LEFT SHIFT 1
2: P9603 80061 82310 84305 86708  #NO CHANGE
3: 62311 J0690                    #LEFT SHIFT 3
4: 80061 80061                    #LEFT SHIFT 1 FOR FIRST ITEM, 
                                  #LEFT SHIFT 2 FOR 2ND ITEM 
5: 11040 A0428                    #NO CHANGE
6: 80053 85025 86357              #NO CHANGE

I am shifting columns left, if the value on left is empty

like image 637
Hardik Gupta Avatar asked Nov 25 '17 09:11

Hardik Gupta


People also ask

How do I move a column to a row in R?

Thus, to convert columns of an R data frame into rows we can use transpose function t. For example, if we have a data frame df with five columns and five rows then we can convert the columns of the df into rows by using as. data. frame(t(df)).

How do I move a column to the front in R?

To move a column to first in the dataframe, we use relocate() with the column name we want to move. This will move the column of interest to the first column. We can also move the column of interest to a location after another column in the dataframe.


2 Answers

Here is an option using data.table. Grouped by the sequence of rows, unlist the Subset of data.table (.SD), order by the logical vector (un==''), convert to list and then set the names with the original column names after removing the 'grp' column

setnames(temp[, {un <- unlist(.SD); as.list(un[order(un=='')])},
    .(grp = 1:nrow(temp))][, grp := NULL], names(temp))[]
#  col_1 col_2 col_3 col_4 col_5
#1: 84484 V2632 J1170            
#2: P9603 80061 82310 84305 86708
#3: 62311 J0690                  
#4: 80061 80061                  
#5: 11040 A0428                  
#6: 80053 85025 86357       

Or another option is to melt into long format after creating a sequence column, then dcast it to wide format

dcast(melt(temp[, n := seq_len(.N)], id.var = 'n')[order(n, value == ''),
     .(value, variable = names(temp)[1:5]), n], n ~ variable)[, n := NULL][]
like image 58
akrun Avatar answered Oct 31 '22 15:10

akrun


There may be a more elegant way, but this works:

library(plyr)

x = apply(temp,1,function(x) {t(as.matrix(unname(x[nchar(x)>0])))})
x = do.call(rbind.fill.matrix, x)
x[is.na(x)]=''
colnames(x) = colnames(temp)[1:ncol(x)]
x = as.data.frame(x)

Output:

  col_1 col_2 col_3 col_4 col_5
1 84484 V2632 J1170            
2 P9603 80061 82310 84305 86708
3 62311 J0690                  
4 80061 80061                  
5 11040 A0428                  
6 80053 85025 86357

Basically, find all entries per row with nchar(x)>0 and row bind them using rbind.fill.matrix, so they are left-aligned. Then replace the NA's with '', replace the column names with the original ones (take into account that there may be less columns left), and convert to dataframe.

Hope this helps!

like image 45
Florian Avatar answered Oct 31 '22 14:10

Florian