Combine two data frames of the same size one column after each other

Tags: dataframe, r, cbind

I have two datasets of the same size [132,450000]: one with values and another with the p-values corresponding to those values. I want to combine them into one large data frame [264,450000] in which each column of values is followed by the column with the corresponding p-values. The row names are exactly the same, and the column names follow the pattern sample1 in df1 and sample1_pval in df2.

For example, I have two data frames like this:

> df1
    x y
cg1 1 a
cg2 2 b
cg3 3 c
cg4 4 d
cg5 5 e

> df2
     x_pval y_pval 
cg1   6      f
cg2   7      g
cg3   8      h
cg4   9      i
cg5  10      j

I want to merge them in this order: the 1st column of df1, followed by the 1st column of df2, followed by the 2nd column of df1, followed by the 2nd column of df2, and so on.

So then it will look like this:

> df
           x       x_pval    y        y_pval
cg1        1        6        a        f
cg2        2        7        b        g
cg3        3        8        c        h
cg4        4        9        d        i
cg5        5       10        e        j
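
To make the example reproducible, the two data frames can be rebuilt roughly as follows. This is a minimal sketch: the values come from the printout above, while `stringsAsFactors = FALSE` and the two sanity checks are additions that are not part of the original post.

```r
# Rebuild the two example data frames from the printed output.
df1 <- data.frame(x = 1:5, y = letters[1:5],
                  row.names = paste0("cg", 1:5),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x_pval = 6:10, y_pval = letters[6:10],
                  row.names = paste0("cg", 1:5),
                  stringsAsFactors = FALSE)

# Checks that any column-interleaving approach relies on:
# identical row names, and df2 names equal to df1 names plus "_pval".
stopifnot(identical(rownames(df1), rownames(df2)),
          identical(paste0(names(df1), "_pval"), names(df2)))
```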

I want to keep the column names, but I can add the row names later since they are the same in both data frames. Since I'm working with a large dataset, I don't want to type in all the columns and use "cbind", and I couldn't find a way to make "merge" take one column from each dataset at a time.

Is there a formula or package that does this?

Anyone who can help me out?

Fleur Peters asked Sep 12 '17 09:09


2 Answers

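Another option would be to cbind both datasets and then reorder the columns by ordering the concatenated column indices of the two datasets, so that column i of df1 is immediately followed by column i of df2: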

cbind(df1, df2)[order(c(seq_along(df1), seq_along(df2)))]
#    x x_pval y y_pval
#cg1 1      6 a      f
#cg2 2      7 b      g
#cg3 3      8 c      h
#cg4 4      9 d      i
#cg5 5     10 e      j
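
The one-liner works because `order()` keeps ties in their original position, so index 1 of df1 comes before index 1 of df2, and so on. The same interleaving can be spelled out explicitly, which may be easier to read. This is only a sketch under the assumption that both data frames have the same number of columns in matching order; the names `both`, `idx` and `res` are made up for illustration.

```r
# Example data from the question.
df1 <- data.frame(x = 1:5, y = letters[1:5],
                  row.names = paste0("cg", 1:5), stringsAsFactors = FALSE)
df2 <- data.frame(x_pval = 6:10, y_pval = letters[6:10],
                  row.names = paste0("cg", 1:5), stringsAsFactors = FALSE)

both <- cbind(df1, df2)

# Column i of df1 sits at position i in `both`;
# column i of df2 sits at position ncol(df1) + i.
# rbind() stacks the two index vectors as rows, and c() reads the
# resulting matrix column by column, which interleaves them.
idx <- c(rbind(seq_along(df1), ncol(df1) + seq_along(df2)))

# Same permutation as the order() trick above.
stopifnot(identical(idx, order(c(seq_along(df1), seq_along(df2)))))

res <- both[idx]
names(res)
```

Because only a vector of column positions is built, the approach does not depend on the column names at all.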
akrun answered Nov 23 '22 21:11


One idea is to cbind the data frames and order the columns on their name prefixes, i.e.

dd <- cbind(df1, df2)
dd[order(sub('_.*', '', names(dd)))]

which gives,

    x x_pval y y_pval
cg1 1      6 a      f
cg2 2      7 b      g
cg3 3      8 c      h
cg4 4      9 d      i
cg5 5     10 e      j

If each data frame has exactly two columns, as in your example, then this will also work,

data.frame(dd[c(TRUE, FALSE)], dd[c(FALSE, TRUE)]) #dd taken from above
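
Note that ordering on prefixes sorts the columns as strings (so sample10 would typically land before sample2), and names that contain another underscore before the suffix would collapse to the same prefix. If the original column order of df1 should be kept, the target order can instead be built directly from the names. This is a sketch that assumes every column in df1 has a partner in df2 named with the "_pval" suffix described in the question; `wanted` and `res` are illustrative names.

```r
# Example data from the question.
df1 <- data.frame(x = 1:5, y = letters[1:5],
                  row.names = paste0("cg", 1:5), stringsAsFactors = FALSE)
df2 <- data.frame(x_pval = 6:10, y_pval = letters[6:10],
                  row.names = paste0("cg", 1:5), stringsAsFactors = FALSE)

dd <- cbind(df1, df2)

# Each df1 name followed by its "_pval" partner, in df1's own order.
wanted <- c(rbind(names(df1), paste0(names(df1), "_pval")))

# Fail early if a partner column is missing.
stopifnot(all(wanted %in% names(dd)))

res <- dd[wanted]
names(res)
```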
Sotos answered Nov 23 '22 23:11