
Python pandas: How many values of one series are in another?

Tags: python, pandas

I have two pandas dataframes:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    {
    "col1": ["1","2",np.nan,"3"],
    }
)

df2 = pd.DataFrame(
    {
    "col1": [2.0,3.0,4.0,np.nan],
    }
)

I would like to know how many values of df1.col1 exist in df2.col1. In this case it should be 2 as I want "2" and 2.0 to be seen as equal.

I have a working solution, but since I expect to need this more often (and for learning purposes, of course), I wanted to ask whether there is a more convenient way to do it.

df1.col1[df1.col1.notnull()].isin(df2.col1[df2.col1.notnull()].astype(int).astype(str)).value_counts()

Julian asked Sep 01 '25 01:09


1 Answer

Use Series.dropna to remove the missing values and convert df1.col1 to floats, since a numeric column that holds integers and missing values is stored as float:

a = df1.col1.dropna().astype(float).isin(df2.col1.dropna()).value_counts()

Or convert df2.col1 to integers and then to strings, and compare as strings:

a = df1.col1.dropna().isin(df2.col1.dropna().astype(int).astype(str)).value_counts()
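
If only the number of matches is needed, either variant can probably be reduced to a single integer with sum() instead of value_counts(), because True counts as 1. A minimal sketch, assuming the df1 and df2 from the question; the names left, right, right_str, mask_float and mask_str are only illustrative:

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({"col1": ["1", "2", np.nan, "3"]})
df2 = pd.DataFrame({"col1": [2.0, 3.0, 4.0, np.nan]})

# Variant 1: compare as floats.
# dropna() removes NaN first, so missing values are never counted as matches.
left = df1["col1"].dropna().astype(float)
right = df2["col1"].dropna()
mask_float = left.isin(right)

# Variant 2: compare as strings.
# astype(int) truncates, so this assumes df2 only holds whole numbers.
right_str = df2["col1"].dropna().astype(int).astype(str)
mask_str = df1["col1"].dropna().isin(right_str)

# Summing a boolean Series counts its True entries.
n_float = int(mask_float.sum())
n_str = int(mask_str.sum())

print(n_float, n_str)
```

With the sample data both variants should agree. The float variant is likely the safer default, since the string variant silently changes values such as 2.5 when casting to int.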

jezrael answered Sep 02 '25 13:09