
Create column of value_counts in Pandas dataframe

Tags: python, pandas

I want to count the unique values in one of my Pandas dataframe columns and then add a new column with those counts to my original dataframe. I've tried a couple of different things. I created a pandas Series and then calculated counts with the value_counts method. I tried to merge these values back into my original dataframe, but the keys that I want to merge on are in the index (ix/loc).

Color Value
Red   100
Red   150
Blue  50

I want to return something like:

Color Value Counts
Red   100   2
Red   150   2
Blue  50    1
user2592989 asked Jul 17 '13 20:07




1 Answer

df['Counts'] = df.groupby(['Color'])['Value'].transform('count') 

For example,

In [102]: df = pd.DataFrame({'Color': 'Red Red Blue'.split(), 'Value': [100, 150, 50]})

In [103]: df
Out[103]:
  Color  Value
0   Red    100
1   Red    150
2  Blue     50

In [104]: df['Counts'] = df.groupby(['Color'])['Value'].transform('count')

In [105]: df
Out[105]:
  Color  Value  Counts
0   Red    100       2
1   Red    150       2
2  Blue     50       1

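Note that transform('count') counts only the non-NaN entries of Value in each group, so rows where Value is NaN are left out of the count. If you want to count every row, NaNs included, use transform(len).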
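To make the NaN behaviour concrete, here is a minimal sketch. The data and the column names NonNullCounts and RowCounts are made up for illustration; one Red row is given a missing Value so the two counts differ.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one of the Red rows has a missing Value.
df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'],
                   'Value': [100, np.nan, 50]})

grouped = df.groupby('Color')['Value']

# 'count' tallies only the non-NaN entries in each group.
df['NonNullCounts'] = grouped.transform('count')

# len counts every row in the group, whether Value is NaN or not.
df['RowCounts'] = grouped.transform(len)

print(df)
```

With this data, NonNullCounts should be 1 for every row, while RowCounts should be 2 for both Red rows and 1 for the Blue row.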


To the anonymous editor: If you are getting an error while using transform('count') it may be due to your version of Pandas being too old. The above works with pandas version 0.15 or newer.
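The question describes going through value_counts and getting stuck because the keys end up in the index. That route can also work; the following is a sketch of one way to do it, not part of the answer above. The names color_counts and merged are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Color': 'Red Red Blue'.split(),
                   'Value': [100, 150, 50]})

# value_counts returns a Series whose index holds the distinct colors.
color_counts = df['Color'].value_counts()

# map looks up each row's Color in that index, so no merge is needed.
df['Counts'] = df['Color'].map(color_counts)

# The same lookup written as a merge against the Series' index.
merged = df[['Color', 'Value']].merge(
    color_counts.rename('Counts'),
    left_on='Color', right_index=True, how='left')

print(df)
print(merged)
```

Because value_counts skips NaN by default, rows whose Color is NaN would get NaN in Counts with this approach.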

unutbu answered Nov 05 '22 10:11
