
Concatenate rows of two dataframes in pandas

I need to concatenate two dataframes, df_a and df_b, that have an equal number of rows (nRow), side by side and without any consideration of keys. This is similar to cbind in the R programming language. The number of columns in each dataframe may be different.

The resulting dataframe will have the same number of rows (nRow), and its number of columns will equal the sum of the column counts of the two dataframes. In other words, this is a blind columnar concatenation of two dataframes.

    import pandas as pd

    dict_data = {'Treatment': ['C', 'C', 'C'],
                 'Biorep': ['A', 'A', 'A'],
                 'Techrep': [1, 1, 1],
                 'AAseq': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'],
                 'mz': [500.0, 500.5, 501.0]}
    df_a = pd.DataFrame(dict_data)

    dict_data = {'Treatment1': ['C', 'C', 'C'],
                 'Biorep1': ['A', 'A', 'A'],
                 'Techrep1': [1, 1, 1],
                 'AAseq1': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'],
                 'inte1': [1100.0, 1050.0, 1010.0]}
    df_b = pd.DataFrame(dict_data)
user1140126 asked Jan 25 '15 10:01



1 Answer

Call pd.concat and pass axis=1 to concatenate column-wise:

    In [5]: pd.concat([df_a, df_b], axis=1)
    Out[5]:
            AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
    0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1
    1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1
    2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1

      Treatment1  inte1
    0          C   1100
    1          C   1050
    2          C   1010
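One caveat worth sketching: pd.concat with axis=1 aligns rows on index labels, not on position. The frames in the question both carry the default 0..2 index, so the result matches a blind cbind. The example below is only an illustration; the index labels 10, 11, 12 and the cut-down columns are assumptions made for the demo, not part of the question's data. It shows what happens when the labels differ and how resetting the index restores purely positional behaviour.

```python
import pandas as pd

# Hypothetical frames: same number of rows, but different index labels.
df_a = pd.DataFrame({'mz': [500.0, 500.5, 501.0]}, index=[0, 1, 2])
df_b = pd.DataFrame({'inte1': [1100.0, 1050.0, 1010.0]}, index=[10, 11, 12])

# Label-based alignment: the result holds the union of both indices,
# with NaN wherever a frame has no row for a given label.
aligned = pd.concat([df_a, df_b], axis=1)
print(aligned.shape)

# Positional ("blind") concatenation: drop the existing labels first,
# so both frames fall back to the default 0..n-1 index.
positional = pd.concat(
    [df_a.reset_index(drop=True), df_b.reset_index(drop=True)],
    axis=1,
)
print(positional.shape)
```

If the original labels of one frame need to be kept, an alternative is to assign that frame's index to the other before concatenating.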

There is a useful guide to the various methods of merging, joining and concatenating online.

For example, as you have no clashing columns, you can merge on the indices, since both dataframes have the same number of rows and share the same default index:

    In [6]: df_a.merge(df_b, left_index=True, right_index=True)
    Out[6]:
            AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
    0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1
    1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1
    2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1

      Treatment1  inte1
    0          C   1100
    1          C   1050
    2          C   1010
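The "no clashing columns" condition matters for the column names in the result. As a rough sketch, assume both frames had a column called Techrep (a hypothetical variation on the question's data, where the names are actually distinct). An index-based merge then keeps both columns and disambiguates them with suffixes, which default to _x and _y and can be overridden with the suffixes parameter.

```python
import pandas as pd

# Hypothetical frames that share the column name 'Techrep'.
df_a = pd.DataFrame({'Techrep': [1, 1, 1], 'mz': [500.0, 500.5, 501.0]})
df_b = pd.DataFrame({'Techrep': [2, 2, 2], 'inte1': [1100.0, 1050.0, 1010.0]})

# Default suffixes are '_x' (left frame) and '_y' (right frame).
merged = df_a.merge(df_b, left_index=True, right_index=True)
print(list(merged.columns))

# Custom suffixes make the origin of each column easier to read.
named = df_a.merge(df_b, left_index=True, right_index=True,
                   suffixes=('_a', '_b'))
print(list(named.columns))
```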

And for the same reasons as above a simple join works too:

    In [7]: df_a.join(df_b)
    Out[7]:
            AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
    0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1
    1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1
    2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1

      Treatment1  inte1
    0          C   1100
    1          C   1050
    2          C   1010
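Unlike merge, join does not add suffixes on its own: if the two frames share a column name, it raises a ValueError unless lsuffix and/or rsuffix is supplied. The sketch below again assumes a hypothetical shared Techrep column, which the question's data does not have, purely to illustrate that behaviour.

```python
import pandas as pd

# Hypothetical frames that share the column name 'Techrep'.
df_a = pd.DataFrame({'Techrep': [1, 1, 1], 'mz': [500.0, 500.5, 501.0]})
df_b = pd.DataFrame({'Techrep': [2, 2, 2], 'inte1': [1100.0, 1050.0, 1010.0]})

# A plain join refuses to guess how to rename overlapping columns.
try:
    df_a.join(df_b)
    raised = False
except ValueError:
    raised = True
print(raised)

# Supplying suffixes resolves the clash explicitly.
joined = df_a.join(df_b, lsuffix='_a', rsuffix='_b')
print(list(joined.columns))
```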
EdChum answered Sep 27 '22 18:09
