
Pandas python - matching values

I currently have two dataframes that share a matching column. For example:

Data frame 1 with columns : A,B,C

Data frame 2 with column : A

I want to keep all rows of the first dataframe whose value in column A also appears in column A of the second dataframe. For example, if df1 and df2 are:

df1

A B C
0 1 3
4 2 5
6 3 1
8 0 0
2 1 1

df2
A
4
6
1

So in this case, I want to keep only the second and third rows of df1. I tried doing it like this, but it didn't work since both dataframes are pretty big:

for index, row in df1.iterrows():
    counter = 0
    for index2,row2 in df2.iterrows():
        if row["A"] == row2["A"]:
            counter = counter + 1
    if counter == 0:
        df1.drop(index, inplace=True)  # drop from df1: index is a df1 row label
user1823812 asked Jan 08 '23 00:01



2 Answers

Use isin to test for membership:

In [176]:
df1[df1['A'].isin(df2['A'])]

Out[176]:
   A  B  C
1  4  2  5
2  6  3  1
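
For reference, here is a minimal self-contained sketch of the same idea, rebuilt from the sample data in the question. The names `mask`, `kept` and `dropped` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame(
    [[0, 1, 3], [4, 2, 5], [6, 3, 1], [8, 0, 0], [2, 1, 1]],
    columns=["A", "B", "C"],
)
df2 = pd.DataFrame({"A": [4, 6, 1]})

# Boolean mask: True where df1's A value occurs anywhere in df2's A column.
mask = df1["A"].isin(df2["A"])

# Keep the matching rows; df1's original index labels are preserved.
kept = df1[mask]

# Negating the mask selects the rows with no match in df2.
dropped = df1[~mask]

print(kept)
```

Because the mask is computed in one vectorized pass rather than with nested `iterrows` loops, this approach should scale much better to large frames.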
EdChum answered Jan 10 '23 07:01



Or use the merge method:

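import pandas

df1 = pandas.DataFrame([[0, 1, 3], [4, 2, 5], [6, 3, 1], [8, 0, 0], [2, 1, 1]], columns=['A', 'B', 'C'])
df2 = pandas.DataFrame([4, 6, 1], columns=['A'])
# Inner join on A: keeps only the rows whose A value appears in both frames
df2.merge(df1, on='A')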
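
One point worth checking with merge is that it is a join, not a filter: if the lookup frame holds repeated values, matching rows are repeated in the result, and the result gets a fresh index. The sketch below illustrates this with a hypothetical `df2_dup` (not in the original question) in which the value 4 appears twice.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[0, 1, 3], [4, 2, 5], [6, 3, 1], [8, 0, 0], [2, 1, 1]],
    columns=["A", "B", "C"],
)
# Hypothetical variant of df2 in which the value 4 appears twice.
df2_dup = pd.DataFrame({"A": [4, 6, 1, 4]})

# Inner join on A: the df1 row with A == 4 is emitted once per match.
joined = df2_dup.merge(df1, on="A")

# De-duplicating the keys first makes the join behave like a membership test.
filtered = df1.merge(df2_dup.drop_duplicates(), on="A")

print(len(joined), len(filtered))
```

If the original index labels of df1 matter, or the lookup column may contain duplicates, the isin approach is probably the simpler choice.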
dlm answered Jan 10 '23 06:01
