I have a data frame which is the result of another command. This data frame has only one row with around 40,000 entries. My problem is that every three columns form one connected set of data. Therefore I want to split the row after every third column and move each group of three into a new row. Example:
Create a test data frame:
df <- as.data.frame(matrix(1:12, ncol = 12, nrow = 1))
Now I have a data frame that looks like this:
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
1  1  2  3  4  5  6  7  8  9  10  11  12
But I need it like this:
V1 V2 V3
 1  2  3
 4  5  6
 7  8  9
10 11 12
How can I achieve this?
Try
as.data.frame(matrix(unlist(df, use.names = FALSE), ncol = 3, byrow = TRUE))
# V1 V2 V3
#1 1 2 3
#2 4 5 6
#3 7 8 9
#4 10 11 12
Or you could directly use matrix on df:
as.data.frame(matrix(df, ncol = 3, byrow = TRUE))
Could also try using dim<- (just for general knowledge):
as.data.frame(t(`dim<-`(unlist(df), c(3, 4))))
# V1 V2 V3
# 1 1 2 3
# 2 4 5 6
# 3 7 8 9
# 4 10 11 12
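As a quick sketch of what the dim<- step is doing here (using the same 12-entry df from the question): unlist(df) gives the plain vector 1:12, assigning dim = c(3, 4) fills it column-wise into a 3 x 4 matrix, and transposing that gives the 4 x 3 layout shown above.
x <- unlist(df, use.names = FALSE)  # the plain vector 1 2 3 ... 12
dim(x) <- c(3, 4)                   # filled column-wise: columns are 1:3, 4:6, 7:9, 10:12
x
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7   10
# [2,]    2    5    8   11
# [3,]    3    6    9   12
t(x)                                # transposing recovers the desired 4 x 3 result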
This turned out to be faster than I expected (though still not as fast as the obvious approach that @akrun took), so I'm going to post this (like David) "just for general knowledge". (Plus, "data.table" all the things.) :-)
Create a data.table with three columns:
1. the unlisted values from the input data.frame,
2. a grouping variable that says which output row each value belongs to, and
3. a column index that cycles through 1:3.
Once you have that, you can use dcast.data.table to get the output you mention (plus a bonus column).
For point number 2 above, we can easily define a function like the following to make the process of creating groups easy:
groupMaker <- function(vecLen, perGroup) {
  (0:(vecLen - 1) %/% perGroup) + 1
}
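As a quick sanity check on the 12-column example df, groupMaker produces one group number per input column, repeating each number perGroup times:
groupMaker(12, 3)
# [1] 1 1 1 2 2 2 3 3 3 4 4 4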
Then we can use it as follows:
dcast.data.table(
  data.table(value = unlist(df, use.names = FALSE),
             row = groupMaker(ncol(df), 3),
             col = 1:3),
  row ~ col)
# row 1 2 3
# 1: 1 1 2 3
# 2: 2 4 5 6
# 3: 3 7 8 9
# 4: 4 10 11 12
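For reference, this is the intermediate long-format data.table that gets reshaped (just the inner data.table(...) call from the code above, run on its own for the 12-column df):
library(data.table)
data.table(value = unlist(df, use.names = FALSE),
           row = groupMaker(ncol(df), 3),
           col = 1:3)
#     value row col
#  1:     1   1   1
#  2:     2   1   2
#  3:     3   1   3
#  4:     4   2   1
#  5:     5   2   2
#  6:     6   2   3
#  7:     7   3   1
#  8:     8   3   2
#  9:     9   3   3
# 10:    10   4   1
# 11:    11   4   2
# 12:    12   4   3
Each group of three consecutive values shares a row number, and dcast.data.table then spreads the col index out across the columns.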
Now, you mention that you are actually dealing with a single-row ~ 40K column data.frame
(I'll assume it to be 39,999 columns since that's nicely divisible by 3 and I don't want to break the other answers).
Keeping that in mind, here are some (useless) benchmarks (useless because we're talking milliseconds here, really).
set.seed(1)
S <- sample(20, 39999, TRUE)
S <- data.frame(t(S))
funAM <- function(indf) {
  dcast.data.table(
    data.table(value = unlist(indf, use.names = FALSE),
               row = groupMaker(ncol(indf), 3),
               col = 1:3),
    row ~ col)
}

funDA <- function(indf) {
  as.data.frame(t(`dim<-`(unlist(indf), c(3, ncol(indf)/3))))
}

funAK <- function(indf) as.data.frame(matrix(indf, ncol = 3, byrow = TRUE))
library(microbenchmark)
microbenchmark(funAM(S), funDA(S), funAK(S))
# Unit: milliseconds
#      expr       min        lq      mean    median        uq      max neval
#  funAM(S) 18.487001 18.813297 22.105766 18.999891 19.455812 50.25876   100
#  funDA(S) 37.187177 37.450893 40.393893 37.870683 38.869726 94.20128   100
#  funAK(S)  5.018571  5.149758  5.929944  5.271679  5.536449 26.93281   100
Where this might be useful would be in cases where the number of desired columns and your number of input columns are not nicely divisible by each other.
For example, try the following sample data:
set.seed(1)
S2 <- sample(20, 40000, TRUE)
S2 <- data.frame(t(S2))
With this sample data:
funAM would give you a warning, but would correctly give you the last two columns of the last row as NA.
funAK would give you a warning, but would (presumably) incorrectly recycle values in the last row.
funDA would just give you an error.
I still think you should just fix the problem at the source though :-)