I am currently running into a problem and hoping someone can assist. I have two DataFrames of items that are hundreds of thousands of lines long (one over 200k rows, the other over 180k). The larger of the two contains unique user values, while the smaller one does not. For example:
df1:
user1
user2
user3
user4
user5
df2:
user1
user1
user5
user4
user5
user5
What I need to do is take each user from df1 and efficiently check whether it occurs in df2 and, if so, how many times.
Thanks!
Using value_counts
df1['Newcount'] = df1['df1:'].map(df2['df2:'].value_counts())
df1
Out[117]:
df1: Newcount
0 user1 2.0
1 user2 NaN
2 user3 NaN
3 user4 1.0
4 user5 3.0
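A self-contained sketch of the value_counts + map approach on the toy data above (here the column is assumed to be named 'user' rather than the literal 'df1:'/'df2:' headers, and missing users are filled with 0):

```python
import pandas as pd

# Toy frames mirroring the question; the column name 'user' is an assumption.
df1 = pd.DataFrame({'user': ['user1', 'user2', 'user3', 'user4', 'user5']})
df2 = pd.DataFrame({'user': ['user1', 'user1', 'user5', 'user4', 'user5', 'user5']})

# value_counts builds a user -> count Series; map looks each df1 user up in it.
df1['Newcount'] = df1['user'].map(df2['user'].value_counts())

# Users absent from df2 come back as NaN; fillna(0) turns them into zeros.
df1['Newcount'] = df1['Newcount'].fillna(0).astype(int)
print(df1)
```

Both value_counts and map are vectorized, so this stays fast even on hundreds of thousands of rows.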
Assuming the relevant column in each DataFrame is called 'user', you can use
pd.merge(
    df1,
    df2.user.groupby(df2.user).count().rename('count'),
    left_on='user',
    right_index=True,
    how='left')
Explanation:
The groupby + count will find the number of occurrences of each user. It creates a Series whose index is the user and whose values are the counts (renamed here to 'count' so it does not collide with df1's 'user' column). The merge then left-merges that result onto df1.
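Putting the steps above together into a runnable sketch (column name 'user' and the intermediate name 'counts' are assumptions; users missing from df2 show up as NaN after the left merge):

```python
import pandas as pd

df1 = pd.DataFrame({'user': ['user1', 'user2', 'user3', 'user4', 'user5']})
df2 = pd.DataFrame({'user': ['user1', 'user1', 'user5', 'user4', 'user5', 'user5']})

# groupby + count gives a Series indexed by user; rename it so the merged
# column does not collide with df1's existing 'user' column.
counts = df2.user.groupby(df2.user).count().rename('count')

# Left-merge the counts onto df1, matching df1's 'user' column against the
# index of the counts Series.
out = pd.merge(df1, counts, left_on='user', right_index=True, how='left')
print(out)
```

The `how='left'` keeps every row of df1, so users with no matches in df2 survive the merge (with a NaN count that can be filled with 0 if preferred).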