
Python Pandas sort by Time and group by user ID

I am loading a CSV file with pandas. It has three columns: a date and time, a user id, and a campaign ID. Example rows:

date                 user_id              campaign_id
2018-01-10 0:21:09   151312395            GOOGLE
2018-01-10 0:21:19   151312395            GOOGLE
2018-01-10 0:21:32   151312395            GOOGLE 

I want to group the data by user id and, within each user, order the rows by time alongside the campaign ID. The result should look like this:

user_id              date                           ad_campaign
151312395            2018-01-10 0:21:09             GOOGLE
                     2018-01-10 0:21:19             GOOGLE
                     2018-01-10 0:21:32             GOOGLE 

This is what I have so far:

import pandas as pd
import numpy as np
import datetime

def dateparse(time_in_secs):
    return datetime.datetime.fromtimestamp(float(time_in_secs))

columnnames = ['date', 'user_id', 'ad_campaign']
df = pd.read_csv(r'C:\Users\L\Desktop\Data.csv',
                 sep='\t', names=columnnames, usecols=[0, 1, 3],
                 parse_dates=True, date_parser=dateparse)
df.date = pd.to_datetime(df.date)
df = df.sort_values(by='date')
g = df.groupby('user_id')['ad_campaign']
print(g)

This gives the following output:

<pandas.core.groupby.SeriesGroupBy object at 0x04EF26F0>
[Finished in 0.6s]

Why doesn't print show the sorted, grouped rows?

Laila Van Ments asked Apr 26 '18 14:04





1 Answer

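First, print(g) shows only the object's repr because a groupby is lazy: nothing is computed until you aggregate or iterate over it. Also, if date is one of the groupby keys, you don't need to sort it explicitly, since groupby sorts by its keys by default (sort=True).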
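To illustrate why printing the groupby object shows no rows, here is a minimal sketch. It assumes the three example rows from the question, typed in by hand (out of order on purpose) instead of being read from the CSV.

```python
import pandas as pd

# Example rows from the question, built inline (assumption: not read from the CSV).
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-01-10 00:21:32",
        "2018-01-10 00:21:09",
        "2018-01-10 00:21:19",
    ]),
    "user_id": [151312395, 151312395, 151312395],
    "ad_campaign": ["GOOGLE", "GOOGLE", "GOOGLE"],
})

g = df.sort_values("date").groupby("user_id")["ad_campaign"]

# Printing g only shows its type and memory address; no data is computed yet.
print(type(g).__name__)

# Iterating over the groups (or aggregating) is what produces actual data.
for user_id, campaigns in g:
    print(user_id)
    print(campaigns)

# An aggregation such as count() returns a regular Series.
counts = g.count()
print(counts)
```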

You can do:

Method 1:

df.date = pd.to_datetime(df.date)
g = df.groupby(['user_id','date'])['ad_campaign']
print(g.first())
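
As a quick check of Method 1, the sketch below applies it to the example rows from the question, entered out of order and built inline rather than loaded from the file (an assumption for illustration). Note that first() keeps one value per (user_id, date) pair, so rows that share both keys would collapse into one.

```python
import pandas as pd

# Example rows from the question, deliberately unsorted (assumption: built inline).
df = pd.DataFrame({
    "date": ["2018-01-10 00:21:32", "2018-01-10 00:21:09", "2018-01-10 00:21:19"],
    "user_id": [151312395] * 3,
    "ad_campaign": ["GOOGLE"] * 3,
})
df["date"] = pd.to_datetime(df["date"])

# groupby sorts the group keys by default (sort=True),
# so the result is ordered by user_id and then by date.
result = df.groupby(["user_id", "date"])["ad_campaign"].first()
print(result)
```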

Method 2:

df = df.set_index(['user_id', 'date']).sort_index()
print(df)
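
A small sketch of Method 2 on the question's example rows follows. It assumes the rows are built inline, and it adds one hypothetical duplicate row (not in the question's data) to show that, unlike first() in Method 1, indexing and sorting keeps every row.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-01-10 00:21:32",
        "2018-01-10 00:21:09",
        "2018-01-10 00:21:19",
        "2018-01-10 00:21:19",  # hypothetical duplicate row, not in the question's data
    ]),
    "user_id": [151312395] * 4,
    "ad_campaign": ["GOOGLE"] * 4,
})

# set_index and sort_index both return new DataFrames,
# so the result has to be assigned to a name to be kept.
indexed = df.set_index(["user_id", "date"]).sort_index()
print(indexed)
```

When printed, the repeated user_id is shown only once per block of rows, which matches the layout asked for in the question.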
YOLO answered Nov 15 '22 06:11
