
Pandas - Interleave / Zip two DataFrames by row

Suppose I have two dataframes:

>> df1

   0  1  2
0  a  b  c
1  d  e  f

>> df2

   0  1  2
0  A  B  C
1  D  E  F

How can I interleave the rows? i.e. get this:

>> interleaved_df

   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F

(Note my real DFs have identical columns, but not the same number of rows).


What I've tried

Inspired by this question (very similar, but about columns rather than rows):

import pandas as pd
from itertools import chain, zip_longest

df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])  
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])

concat_df = pd.concat([df1,df2])

new_index = chain.from_iterable(zip_longest(df1.index, df2.index))
# new_index now holds the interleaved row indices

interleaved_df = concat_df.reindex(new_index)

ValueError: cannot reindex from a duplicate axis

The last call fails because df1 and df2 have some identical index values (which is also the case with my real DFs).
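
A minimal sketch of why this happens, reusing the same df1 and df2 (the names index_labels and has_unique_index are only illustrative): after pd.concat each frame keeps its own row labels, so every label appears twice and reindex cannot tell which row is meant.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']])

# concat keeps each frame's own labels, so 0 and 1 both appear twice
concat_df = pd.concat([df1, df2])

index_labels = concat_df.index.tolist()        # labels in concatenation order
has_unique_index = concat_df.index.is_unique   # reindex needs this to be True
```

Here index_labels is [0, 1, 0, 1], which is why reindex raises the duplicate-axis error.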

Any ideas?

asked Aug 08 '17 10:08 by OmerB


1 Answer

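You can sort the index after concatenating and then reset it. Use a stable sort (the default quicksort is not guaranteed to be stable), so that for equal index values the df1 row stays ahead of the df2 row:

import pandas as pd

df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])

# kind='mergesort' keeps rows with equal labels in concatenation order
concat_df = pd.concat([df1,df2]).sort_index(kind='mergesort').reset_index(drop=True)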

Output:

   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F
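
Since the question mentions frames with different row counts, here is a small sketch of the same approach on unequal lengths. It assumes both frames use the default 0..n-1 index; the extra third row in df1 is made up for illustration.

```python
import pandas as pd

# df1 has one more row than df2 (illustrative data)
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']])

interleaved_df = (
    pd.concat([df1, df2])
    .sort_index(kind='mergesort')   # stable: df1 row first for equal labels
    .reset_index(drop=True)
)

first_column = interleaved_df[0].tolist()
```

The leftover rows of the longer frame simply end up at the bottom: first_column is ['a', 'A', 'd', 'D', 'g'].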

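EDIT (OmerB): To keep the row order regardless of the index values, reset each index first (the original labels move to an 'index' column), sort on the new positional index, and then restore the original labels:

import pandas as pd

df1 = pd.DataFrame([['a','b','c'], ['d','e','f']]).reset_index()
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']]).reset_index()

# Stable sort on the positional index, then put the original labels back
concat_df = pd.concat([df1,df2]).sort_index(kind='mergesort').set_index('index')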
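
The same idea can be wrapped in a small helper that never touches the labels at all. This is only a sketch: interleave_rows is a hypothetical name, not a pandas function, and the string-labelled frames are made-up test data.

```python
import numpy as np
import pandas as pd

def interleave_rows(first, second):
    """Interleave rows by position, keeping each row's original label."""
    combined = pd.concat([first, second])
    # Position of every row inside its own frame: 0..len(first)-1, 0..len(second)-1
    positions = np.concatenate([np.arange(len(first)), np.arange(len(second))])
    # Stable argsort keeps the row from `first` ahead of the one from `second`
    order = np.argsort(positions, kind='stable')
    return combined.iloc[order]

# Illustrative frames with arbitrary, overlapping labels and unequal lengths
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']],
                   index=['x', 'y', 'z'])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']], index=['y', 'x'])

result = interleave_rows(df1, df2)
```

Because rows are selected with iloc, duplicate or unsorted labels do not matter; call reset_index(drop=True) on the result if a fresh 0..n-1 index is wanted.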

answered Oct 20 '22 04:10 by Bharath