Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging rows and summing values by date in Python [duplicate]

I have a data frame that has multiple duplicated instances, sorted by date. It looks like this:

enter image description here

And I am trying to merge the rows by date for matching "Keywords", and sum the "Views" count.

The result I am looking to obtain is the following:

enter image description here

Could anyone hint me how I could achieve that in Python? Thank you.

The data frame:

 df = pd.DataFrame([["3/8/14", "adapter", 2], ["3/8/14", "adapter", 5], ["3/8/14", "charger", 1],
                   ["13/8/14", "adapter", 1], ["13/8/14", "battery-case", 0]],
                  columns=['Date', 'Keyword', 'Views'])
like image 354
Notna Avatar asked Dec 19 '22 01:12

Notna


1 Answers

You want a groupby!

import pandas as pd

df = pd.DataFrame([["3/8/14", "adapter", 2], ["3/8/14", "adapter", 5], ["3/8/14", "charger", 1],
                   ["13/8/14", "adapter", 1], ["13/8/14", "battery-case", 0]],
                  columns=['Date', 'Keyword', 'Views'])
print df

#       Date       Keyword  Views
# 0   3/8/14       adapter      2
# 1   3/8/14       adapter      5
# 2   3/8/14       charger      1
# 3  13/8/14       adapter      1
# 4  13/8/14  battery-case      0

df2 = df.groupby(['Date','Keyword'],as_index=False).agg({'Views': 'sum'})

print df2

#       Date       Keyword  Views
# 0  13/8/14       adapter      1
# 1  13/8/14  battery-case      0
# 2   3/8/14       adapter      7
# 3   3/8/14       charger      1
like image 90
fenrisulfr Avatar answered Dec 27 '22 02:12

fenrisulfr