Suppose I have two dataframes, d1 and d2:
d1 = pd.DataFrame(np.ones((3, 3), dtype=int), list('abc'), [0, 1, 2])
d2 = pd.DataFrame(np.zeros((3, 2), dtype=int), list('abc'), [3, 4])
d1
   0  1  2
a  1  1  1
b  1  1  1
c  1  1  1
d2
   3  4
a  0  0
b  0  0
c  0  0
What is an easy and generalized way to interweave the columns of two dataframes? We can assume that the number of columns in d2 is always one less than the number of columns in d1, and that the indices are the same.
I want this:
pd.concat([d1[0], d2[3], d1[1], d2[4], d1[2]], axis=1)
   0  3  1  4  2
a  1  0  1  0  1
b  1  0  1  0  1
c  1  0  1  0  1
The concat() function concatenates DataFrames along an axis: by default it appends the rows of one to the other, and with axis=1 it places them side by side, aligned on the index. The merge() function is equivalent to the SQL JOIN clause. 'left', 'right' and 'inner' joins are all possible.
Key points: pandas' merge and concat can be used to combine subsets of a DataFrame, or even data from different files. The join method combines DataFrames based on index or column. Joining two DataFrames can be done in multiple ways (left, right, and inner) depending on what data must be in the final DataFrame.
We can use either pandas.merge() or DataFrame.merge() to merge DataFrames. Merging is similar to a SQL join and supports several join types: inner, left, right, outer, and cross.
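As a rough illustration of the difference between the two functions, here is a minimal sketch. The frames `left` and `right` and their column names are made up for this example and are not part of the question.

```python
import pandas as pd

# Hypothetical example frames, not from the question.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# concat with the default axis=0 stacks rows; axis=1 places the frames
# side by side, aligning them on the index.
stacked = pd.concat([left, right], axis=0)
side_by_side = pd.concat([left, right], axis=1)

# merge joins on a key column, like a SQL JOIN; `how` selects the join type.
inner = left.merge(right, on="key", how="inner")
outer = left.merge(right, on="key", how="outer")
```

For the interweaving problem here, only the axis=1 form of concat is needed, since the two frames already share an index.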
Use pd.concat to combine the DataFrames, and toolz.interleave to reorder the columns:
from toolz import interleave
pd.concat([d1, d2], axis=1)[list(interleave([d1, d2]))]
The resulting output is as expected:
   0  3  1  4  2
a  1  0  1  0  1
b  1  0  1  0  1
c  1  0  1  0  1
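If installing toolz is not an option, the same column ordering can probably be built with the standard library. The sketch below uses itertools.zip_longest. The helper name `interleave_columns` and the `_MISSING` sentinel are made up for this example, and it assumes the column labels are unique across both frames.

```python
from itertools import chain, zip_longest

import numpy as np
import pandas as pd

d1 = pd.DataFrame(np.ones((3, 3), dtype=int), list('abc'), [0, 1, 2])
d2 = pd.DataFrame(np.zeros((3, 2), dtype=int), list('abc'), [3, 4])

# Sentinel used as padding, so that a column legitimately named None
# would not be dropped by mistake.
_MISSING = object()

def interleave_columns(a, b):
    """Return the column labels of a and b in alternating order (hypothetical helper)."""
    pairs = zip_longest(a.columns, b.columns, fillvalue=_MISSING)
    return [c for c in chain.from_iterable(pairs) if c is not _MISSING]

# Assumes labels are unique across d1 and d2; duplicated labels would
# select more than one column each.
result = pd.concat([d1, d2], axis=1)[interleave_columns(d1, d2)]
```

Like the toolz version, this only reorders labels after a single concat, so it does not rely on the frames sharing a dtype.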
Here's one NumPy approach -
def numpy_interweave(d1, d2):
    c1 = list(d1.columns)
    c2 = list(d2.columns)
    N = (len(c1)+len(c2))
    # Interleave the column labels: d1 at even positions, d2 at odd ones
    cols = [None]*N
    cols[::2] = c1
    cols[1::2] = c2
    # Pick a dtype that can hold the values of both frames
    out_dtype = np.result_type(d1.values.dtype, d2.values.dtype)
    out = np.empty((d1.shape[0],N),dtype=out_dtype)
    # Fill the output array in the same alternating pattern
    out[:,::2] = d1.values
    out[:,1::2] = d2.values
    df_out = pd.DataFrame(out, columns=cols, index=d1.index)
    return df_out
Sample run -
In [346]: d1
Out[346]:
   x  y  z
a  6  7  4
b  3  5  6
c  4  6  2
In [347]: d2
Out[347]:
   p  q
a  4  2
b  7  7
c  7  2
In [348]: numpy_interweave(d1, d2)
Out[348]:
   x  p  y  q  z
a  6  4  7  2  4
b  3  7  5  7  6
c  4  7  6  2  2