Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

assign unique ID to each unique value in group after pandas groupby

I have a DataFrame as following.

df = pd.DataFrame({'col1': ['a','b','c','c','d','e','a','h','i','a'],'col2':['3:00','3:00','4:00','4:00','3:00','5:00','5:00','3:00','3:00','2:00']})

df
Out[83]: 
  col1  col2
0    a  3:00
1    b  3:00
2    c  4:00
3    c  4:00
4    d  3:00
5    e  5:00
6    a  5:00
7    h  3:00
8    i  3:00
9    a  2:00    

What I'd like to do is groupby 'col1' and assign a unique ID to different values in col2 as following:

col1  col2  ID
 a    2:00   0
 a    3:00   1
 a    5:00   2
 b    3:00   0
 c    4:00   0
 c    4:00   0
 ... 

I tried to use pd.Categorical but can't quite get to where I wanted to be.

like image 840
user4279562 Avatar asked Jul 13 '17 16:07

user4279562


People also ask

How do you get unique values in Groupby pandas?

To count unique values per groups in Python Pandas, we can use df. groupby('column_name'). count().

How do you keep unique values in pandas?

You can get unique values in column (multiple columns) from pandas DataFrame using unique() or Series. unique() functions. unique() from Series is used to get unique values from a single column and the other one is used to get from multiple columns.

How extract unique values from multiple columns in pandas?

Pandas series aka columns has a unique() method that filters out only unique values from a column. The first output shows only unique FirstNames. We can extend this method using pandas concat() method and concat all the desired columns into 1 single column and then find the unique of the resultant column.

Does Groupby preserve index?

The Groupby Rolling function does not preserve the original index and so when dates are the same within the Group, it is impossible to know which index value it pertains to from the original dataframe.


1 Answers

we can use pd.factorize() method:

In [170]: df['ID'] = df.groupby('col1')['col2'].transform(lambda x: pd.factorize(x)[0])

In [171]: df
Out[171]:
  col1  col2  ID
0    a  3:00   0
1    b  3:00   0
2    c  4:00   0
3    c  4:00   0
4    d  3:00   0
5    e  5:00   0
6    a  5:00   1
7    h  3:00   0
8    i  3:00   0
9    a  2:00   2
like image 95
MaxU - stop WAR against UA Avatar answered Nov 03 '22 06:11

MaxU - stop WAR against UA