I have several dataframes with columns: coupon_id and rating.
I want to merge this dataframes and get one dataframe with all coupon_id and rating as the sum of all ratings for this coupon_id in all dataframes.
For example. Suppose I have 2 dataframes:
| coupon_id | rating | |:-----------|------------:| | 1 | 40 | | 2 | 60 | | 3 | 50 |
| coupon_id | rating | |:-----------|------------:| | 4 | 70 | | 2 | 80 | | 3 | 60 |
As result I want to get this dataframe:
| coupon_id | rating | |:-----------|------------:| | 1 | 40 | | 2 | 140 | | 3 | 110 | | 4 | 70 |
For this problem I use this code, it works, but it is inefficient
similar_users_ratings = pd.DataFrame(columns=['coupon_id', 'rating'])
for similarUser in most_similar_users:
similar_user_ratings = self.ratingData.loc[self.ratingData['patient_id'] == similarUser[0], :].copy()
similar_user_ratings.loc[:, 'rating'] = similar_user_ratings.loc[:, 'rating'].apply(lambda x: int(x) * similarUser[1])
del similar_user_ratings['patient_id']
similar_users_ratings = similar_users_ratings.merge(similar_user_ratings, on='coupon_id', how='outer')
similar_users_ratings['rating_y'].fillna(.0, inplace=True)
similar_users_ratings['rating_x'].fillna(.0, inplace=True)
similar_users_ratings['rating'] = similar_users_ratings['rating_x'] + similar_users_ratings['rating_y']
del similar_users_ratings['rating_y']
del similar_users_ratings['rating_x']
How can I simplify this piece of code? Thanks.
In fact I have several dataframes such as:
coupon_id rating
69 12 1
coupon_id rating
101 37 1
coupon_id rating
428 11 1
coupon_id rating
1133 11 1
Desired dataset:
coupon_id rating
12 1
37 1
11 2
UPDATE:
In [46]: d1
Out[46]:
coupon_id rating
69 12 1
In [47]: d2
Out[47]:
coupon_id rating
101 37 1
In [48]: d3
Out[48]:
coupon_id rating
428 11 1
In [49]: d4
Out[49]:
coupon_id rating
1133 11 1
In [50]: pd.concat([d1,d2,d3,d4],ignore_index=True).groupby('coupon_id', as_index=False)['rating'].sum(
Out[50]:
coupon_id rating
0 11 2
1 12 1
2 37 1
OLD answer:
In [219]: d1.set_index('coupon_id').add(d2.set_index('coupon_id'), fill_value=0) \
.reset_index()
Out[219]:
coupon_id rating
0 1 40.0
1 2 140.0
2 3 110.0
3 4 70.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With