I have two pandas dataframes, and I would like to combine every row of the first dataframe with every row of the second, like this:
First:
val1 val2
1 2
0 0
2 1
Second:
l1 l2
a a
b c
Result (expected result size = len(first) * len(second)):
val1 val2 l1 l2
1 2 a a
1 2 b c
0 0 a a
0 0 b c
2 1 a a
2 1 b c
The two dataframes do not share an index.
Regards, Secau
Step 1: First of all, import the pandas library.
Step 2: Then, obtain the datasets on which you want to perform a cartesian product.
Step 3: Further, use a merge function to perform the cartesian product on the datasets obtained.
Step 4: Finally, print the cartesian product obtained.
In pandas, the how parameter lets you perform a left, right, inner or outer merge or join on two DataFrames or Series. Before pandas 1.2 there was no how="cross" option, so a cross join had to be emulated by adding a constant key column to both frames and merging on that key. From pandas 1.2 onward, merge and join accept how="cross" directly.
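As a minimal sketch of the direct approach, assuming pandas 1.2 or later (where merge accepts how="cross"), the question's data can be combined as follows. The names first and second are chosen here for illustration only.

```python
import pandas as pd

first = pd.DataFrame({"val1": [1, 0, 2], "val2": [2, 0, 1]})
second = pd.DataFrame({"l1": ["a", "b"], "l2": ["a", "c"]})

# how="cross" pairs every row of `first` with every row of `second`.
# Assumption: pandas >= 1.2; no shared key column or index is needed.
result = first.merge(second, how="cross")
print(result)
```

The result has len(first) * len(second) rows, with the rows of first repeated in their original order.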
Both join and merge can be used to combine two dataframes, but join combines them on the basis of their indexes by default, whereas merge is more versatile and lets you specify columns besides the index to join on for both dataframes.
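To make the difference concrete, here is a small sketch with two made-up frames (left, right and the column names are illustrative, not from the question). merge matches on a named column, while join matches on the index, so the key is moved into the index first.

```python
import pandas as pd

left = pd.DataFrame({"key": ["x", "y"], "val": [1, 2]})
right = pd.DataFrame({"key": ["x", "y"], "label": ["a", "b"]})

# merge: match rows on a named column.
merged = left.merge(right, on="key")

# join: match rows on the index, so set the key as the index on both sides.
joined = left.set_index("key").join(right.set_index("key"))

print(merged)
print(joined)
```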
A Cross Join is a type of join that produces the Cartesian Product of the rows in two or more tables. In other words, it combines each row from the first table with each row from the second table. The following table illustrates the result of a Cross Join when joining the table df1 to another table df2.
Create a surrogate key to do a cartesian join between them...
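import pandas as pd
# Give both frames the same constant key so every row matches every row.
df1 = pd.DataFrame({'A': [1, 0, 2],
                    'B': [2, 0, 1],
                    'tmp': 1})
df2 = pd.DataFrame({'l1': ['a', 'b'],
                    'l2': ['a', 'c'],
                    'tmp': 1})
# Merging on the shared key yields len(df1) * len(df2) rows.
print(pd.merge(df1, df2, on='tmp', how='outer'))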
Result:
   A  B  tmp l1 l2
0  1  2    1  a  a
1  1  2    1  b  c
2  0  0    1  a  a
3  0  0    1  b  c
4  2  1    1  a  a
5  2  1    1  b  c
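The tmp column is only a helper, so it can be left out of the final result. One possible variant, sketched here as an illustration rather than as part of the original answer, adds the key on the fly with assign and drops it after the merge, leaving df1 and df2 unchanged:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 0, 2], "B": [2, 0, 1]})
df2 = pd.DataFrame({"l1": ["a", "b"], "l2": ["a", "c"]})

# Add the constant key only for the merge, then drop it from the result.
result = (
    df1.assign(tmp=1)
    .merge(df2.assign(tmp=1), on="tmp")
    .drop(columns="tmp")
)
print(result)
```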