I currently have two dataframes that have two matching columns. For example :
Data frame 1 with columns : A,B,C
Data frame 2 with column : A
I want to keep all lines in the first dataframe that have the values that the A contains. For example if df2 and df1 are:
df1
A B C
0 1 3
4 2 5
6 3 1
8 0 0
2 1 1
df2
Α
4
6
1
So in this case, I want to only keep the second and third line of df1. I tried doing it like this, but it didnt work since both dataframes are pretty big:
for index, row in df1.iterrows():
counter = 0
for index2,row2 in df2.iterrows():
if row["A"] == row2["A"]:
counter = counter + 1
if counter == 0:
df2.drop(index, inplace=True)
Use isin
to test for membership:
In [176]:
df1[df1['A'].isin(df2['A'])]
Out[176]:
A B C
1 4 2 5
2 6 3 1
Or use the merge method:
df1= pandas.DataFrame([[0,1,3],[4,2,5],[6,3,1],[8,0,0],[2,1,1]], columns = ['A', 'B', 'C'])
df2= pandas.DataFrame([4,6,1], columns = ['A'])
df2.merge(df1, on = 'A')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With