I have a data set like this
temp <- structure(list(col_1 = c("", "P9603", "", "", "11040",
"80053"), col_2 = c("84484", "80061", "", "80061", "A0428", "85025"
), col_3 = c("V2632", "82310", "", "", "", "86357"), col_4 = c("J1170",
"84305", "62311", "80061", "", ""), col_5 = c("", "86708", "J0690",
"", "", "")), .Names = c("col_1", "col_2", "col_3", "col_4",
"col_5"), class = c("data.table", "data.frame"))
col_1 col_2 col_3 col_4 col_5
1: 84484 V2632 J1170
2: P9603 80061 82310 84305 86708
3: 62311 J0690
4: 80061 80061
5: 11040 A0428
6: 80053 85025 86357
Is there a possibility to shift the columns like this
col_1 col_2 col_3 col_4 col_5
1: 84484 V2632 J1170 #LEFT SHIFT 1
2: P9603 80061 82310 84305 86708 #NO CHANGE
3: 62311 J0690 #LEFT SHIFT 3
4: 80061 80061 #LEFT SHIFT 1 FOR FIRST ITEM,
#LEFT SHIFT 2 FOR 2ND ITEM
5: 11040 A0428 #NO CHANGE
6: 80053 85025 86357 #NO CHANGE
I am shifting columns left, if the value on left is empty
Thus, to convert columns of an R data frame into rows we can use transpose function t. For example, if we have a data frame df with five columns and five rows then we can convert the columns of the df into rows by using as. data. frame(t(df)).
To move a column to first in the dataframe, we use relocate() with the column name we want to move. This will move the column of interest to the first column. We can also move the column of interest to a location after another column in the dataframe.
Here is an option using data.table
. Grouped by the sequence of rows, unlist
the Subset of data.table (.SD
), order
by the logical vector (un==''
), convert to list
and then set the names with the original column names after removing the 'grp' column
setnames(temp[, {un <- unlist(.SD); as.list(un[order(un=='')])},
.(grp = 1:nrow(temp))][, grp := NULL], names(temp))[]
# col_1 col_2 col_3 col_4 col_5
#1: 84484 V2632 J1170
#2: P9603 80061 82310 84305 86708
#3: 62311 J0690
#4: 80061 80061
#5: 11040 A0428
#6: 80053 85025 86357
Or another option is to melt
into long format after creating a sequence column, then dcast
it to wide format
dcast(melt(temp[, n := seq_len(.N)], id.var = 'n')[order(n, value == ''),
.(value, variable = names(temp)[1:5]), n], n ~ variable)[, n := NULL][]
There may be a more elegant way, but this works:
library(plyr)
x = apply(temp,1,function(x) {t(as.matrix(unname(x[nchar(x)>0])))})
x = do.call(rbind.fill.matrix, x)
x[is.na(x)]=''
colnames(x) = colnames(temp)[1:ncol(x)]
x = as.data.frame(x)
Output:
col_1 col_2 col_3 col_4 col_5
1 84484 V2632 J1170
2 P9603 80061 82310 84305 86708
3 62311 J0690
4 80061 80061
5 11040 A0428
6 80053 85025 86357
Basically, find all entries per row with nchar(x)>0
and row bind them using rbind.fill.matrix, so they are left-aligned. Then replace the NA
's with ''
, replace the column names with the original ones (take into account that there may be less columns left), and convert to dataframe.
Hope this helps!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With