
Interweave two dataframes

Suppose I have two dataframes, d1 and d2:

import numpy as np
import pandas as pd

d1 = pd.DataFrame(np.ones((3, 3), dtype=int), list('abc'), [0, 1, 2])
d2 = pd.DataFrame(np.zeros((3, 2), dtype=int), list('abc'), [3, 4])

d1

   0  1  2
a  1  1  1
b  1  1  1
c  1  1  1

d2

   3  4
a  0  0
b  0  0
c  0  0

What is an easy, general way to interweave the columns of two dataframes? We can assume that d2 always has exactly one column fewer than d1, and that both dataframes share the same index.

I want this:

pd.concat([d1[0], d2[3], d1[1], d2[4], d1[2]], axis=1)

   0  3  1  4  2
a  1  0  1  0  1
b  1  0  1  0  1
c  1  0  1  0  1

asked Jul 26 '17 16:07 by piRSquared


2 Answers

Use pd.concat to combine the DataFrames, and toolz.interleave to reorder the columns:

from toolz import interleave

pd.concat([d1, d2], axis=1)[list(interleave([d1, d2]))]

The resulting output is as expected:

   0  3  1  4  2
a  1  0  1  0  1
b  1  0  1  0  1
c  1  0  1  0  1
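
If toolz is not installed, the same column order can be built with itertools from the standard library. This is only a sketch of an alternative, not part of the approach above: the names _missing, pairs, order and result are made up for illustration, and d1 and d2 are the frames from the question.

```python
from itertools import chain, zip_longest

import numpy as np
import pandas as pd

d1 = pd.DataFrame(np.ones((3, 3), dtype=int), list('abc'), [0, 1, 2])
d2 = pd.DataFrame(np.zeros((3, 2), dtype=int), list('abc'), [3, 4])

# Sentinel marking the slots where the shorter frame has run out of columns.
_missing = object()

# Pair up the column labels, flatten the pairs, and drop the sentinel.
pairs = zip_longest(d1.columns, d2.columns, fillvalue=_missing)
order = [c for c in chain.from_iterable(pairs) if c is not _missing]

# Concatenate side by side, then select the columns in interleaved order.
result = pd.concat([d1, d2], axis=1)[order]
print(result)
```

Using a sentinel object rather than None as the fill value keeps the filter safe even if None happens to be a real column label.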

answered Sep 29 '22 03:09 by root


Here's one NumPy approach -

def numpy_interweave(d1, d2):
    # Interleave the column labels: d1's go in even slots, d2's in odd slots
    c1 = list(d1.columns)
    c2 = list(d2.columns)
    N = len(c1) + len(c2)
    cols = [None]*N
    cols[::2] = c1
    cols[1::2] = c2

    # Fill one array of a common dtype with the values in the same pattern
    out_dtype = np.result_type(d1.values.dtype, d2.values.dtype)
    out = np.empty((d1.shape[0], N), dtype=out_dtype)
    out[:, ::2] = d1.values
    out[:, 1::2] = d2.values

    return pd.DataFrame(out, columns=cols, index=d1.index)

Sample run -

In [346]: d1
Out[346]: 
   x  y  z
a  6  7  4
b  3  5  6
c  4  6  2

In [347]: d2
Out[347]: 
   p  q
a  4  2
b  7  7
c  7  2

In [348]: numpy_interweave(d1, d2)
Out[348]: 
   x  p  y  q  z
a  6  4  7  2  4
b  3  7  5  7  6
c  4  7  6  2  2
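
One thing to keep in mind: because everything passes through a single NumPy array, every column of the result ends up with one common dtype (whatever np.result_type picks), whereas a pd.concat-based approach keeps each column's own dtype. A small sketch of the difference follows; f1, f2, woven and concat are made-up names, and the function is repeated in compact form only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

def numpy_interweave(d1, d2):
    # Same logic as the function above, condensed.
    cols = [None] * (d1.shape[1] + d2.shape[1])
    cols[::2] = list(d1.columns)
    cols[1::2] = list(d2.columns)
    out = np.empty((d1.shape[0], len(cols)),
                   dtype=np.result_type(d1.values.dtype, d2.values.dtype))
    out[:, ::2] = d1.values
    out[:, 1::2] = d2.values
    return pd.DataFrame(out, columns=cols, index=d1.index)

# Hypothetical inputs: one integer frame and one float frame.
f1 = pd.DataFrame(np.ones((3, 3), dtype=int), list('abc'), list('xyz'))
f2 = pd.DataFrame(np.zeros((3, 2), dtype=float), list('abc'), list('pq'))

woven = numpy_interweave(f1, f2)
concat = pd.concat([f1, f2], axis=1)[list(woven.columns)]

print(woven.dtypes)   # every column shares the common dtype
print(concat.dtypes)  # integer columns stay integer, float columns stay float
```

The values are the same either way; only the dtypes differ, which matters mostly when the two frames hold different kinds of data.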

answered Sep 29 '22 03:09 by Divakar