
pandas two dataframe cross join [duplicate]

I can't find anything about a cross join in pandas, whether through merge, join, or anything else. I need to combine two DataFrames using my own function, myfunc. What I want is the equivalent of:

    for itemA in df1.iterrows():
        for itemB in df2.iterrows():
            t["A"] = myfunc(itemA[1]["A"], itemB[1]["A"])

or, in SQL terms:

    select myfunc(df1.A, df2.A), df1.A, df2.A from df1, df2;

But I need a more efficient solution. If apply is the way to go, how would I implement it? Thanks.

asked Dec 08 '15 17:12 by Vity Lin



2 Answers

Create a common 'key' to cross join the two:

    df1['key'] = 0
    df2['key'] = 0

    df1.merge(df2, on='key', how='outer')
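For reference, here is a minimal runnable sketch of the same constant-key idea. The sample frames are made up for illustration (the question only says both frames have a column "A"), and `assign` is used so the helper key never gets written into the original frames:

```python
import pandas as pd

# Hypothetical sample data; only the column name "A" comes from the question.
df1 = pd.DataFrame({"A": [1, 2, 3]})
df2 = pd.DataFrame({"A": [10, 20]})

# Give every row the same helper key, merge on it, then drop the key again.
# assign() returns copies, so df1 and df2 themselves stay unchanged.
crossed = (
    df1.assign(key=0)
    .merge(df2.assign(key=0), on="key")
    .drop(columns="key")
)

# The shared column "A" shows up as "A_x" (from df1) and "A_y" (from df2).
print(crossed)
```

The result has len(df1) * len(df2) rows. If you are on pandas 1.2 or newer, `df1.merge(df2, how="cross")` should give the same pairing without a helper key.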
answered Sep 26 '22 19:09 by A.Kot


For the cross product, see this question.

Essentially, you do a normal merge but give every row the same key to join on, so that every row of one frame is paired with every row of the other.

You can then add a column to the new frame by applying your function:

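    # 'key' is the constant helper column added to both frames beforehand
    new_df = pd.merge(df1, df2, on='key')
    # use ['new_col'] to create the column; attribute assignment would not add one
    new_df['new_col'] = new_df.apply(lambda row: myfunc(row['A_x'], row['A_y']), axis=1)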

axis=1 forces .apply to work across the rows. 'A_x' and 'A_y' will be the default column names in the resulting frame if the merged frames share a column like in your example above.
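Putting the pieces together, a self-contained sketch might look like the following. The body of `myfunc` and the sample data are placeholders, since the question does not show them. The last step is an assumption on my part: it only works if your function happens to accept whole columns, in which case it avoids the per-row Python call that makes `.apply` slow.

```python
import pandas as pd


def myfunc(a, b):
    # Hypothetical stand-in for the asker's function.
    return a * b


# Hypothetical sample data; only the column name "A" comes from the question.
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20]})

# Cross join via a constant helper key, dropped again after the merge.
new_df = (
    df1.assign(key=0)
    .merge(df2.assign(key=0), on="key")
    .drop(columns="key")
)

# Row-wise: works for any Python function, but calls it once per row.
new_df["new_col"] = new_df.apply(
    lambda row: myfunc(row["A_x"], row["A_y"]), axis=1
)

# Column-wise: only valid when myfunc is built from operations that
# accept whole Series (arithmetic, NumPy ufuncs, and so on).
new_df["new_col_vec"] = myfunc(new_df["A_x"], new_df["A_y"])

print(new_df)
```

Both columns hold the same values here; on large frames the column-wise call is usually the one to prefer when your function allows it.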

answered Sep 24 '22 19:09 by leroyJr