I want to create a count of unique values from one of my Pandas dataframe columns and then add a new column with those counts to my original dataframe. I've tried a couple of different things. I created a pandas Series and then calculated counts with the value_counts method. I tried to merge these values back into my original dataframe, but the keys that I want to merge on are in the index (ix/loc).
Color  Value
Red    100
Red    150
Blue   50
I want to return something like:
Color  Value  Counts
Red    100    2
Red    150    2
Blue   50     1
Return a Series containing counts of unique values. The resulting object will be in descending order so that the first element is the most frequently-occurring element.
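As a rough illustration of the value_counts behaviour described above, here is a minimal sketch that uses the question's sample data. The names `counts` and `df_with_counts` are made up for this example, and mapping the counts back with `Series.map` is just one possible way to get around the "keys are in the index" problem; it is not necessarily what the asker tried.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Color": ["Red", "Red", "Blue"], "Value": [100, 150, 50]})

# value_counts returns a Series indexed by the distinct colors,
# sorted so the most frequent color comes first.
counts = df["Color"].value_counts()

# Because the colors live in the index of `counts`, Series.map can
# look each row's color up directly, with no merge needed.
df_with_counts = df.assign(Counts=df["Color"].map(counts))

print(counts)
print(df_with_counts)
```

Here `map` treats the counts Series as a lookup table keyed by its index, which is why no explicit merge on the index is required.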
count() should be used when you want the number of valid (non-null) values in columns, optionally with respect to a specified grouping column. value_counts() should be used to find the frequency of each distinct value in a Series.
A pandas DataFrame is a two-dimensional data structure that stores data in rows and columns. You can add a column at a specific position using the df.insert(col_index_position, "Col_Name", Col_Values_As_List, True) statement, where the final True is the allow_duplicates argument.
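To make the insert call concrete, the sketch below places a count column at position 1 of the question's sample frame. The variable name `counts_col` and the choice of position are assumptions for illustration only; the allow_duplicates argument is left at its default because the column name is new.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Color": ["Red", "Red", "Blue"], "Value": [100, 150, 50]})

# Per-row count of non-null values within each color group.
counts_col = df.groupby("Color")["Value"].transform("count")

# insert modifies df in place: position 1 puts "Counts"
# between "Color" and "Value".
df.insert(1, "Counts", counts_col)

print(df)
```

Unlike plain assignment with df['Counts'] = ..., which appends the column at the end, insert lets you choose where the new column lands.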
df['Counts'] = df.groupby(['Color'])['Value'].transform('count')
For example,
In [102]: df = pd.DataFrame({'Color': 'Red Red Blue'.split(), 'Value': [100, 150, 50]})

In [103]: df
Out[103]:
  Color  Value
0   Red    100
1   Red    150
2  Blue     50

In [104]: df['Counts'] = df.groupby(['Color'])['Value'].transform('count')

In [105]: df
Out[105]:
  Color  Value  Counts
0   Red    100       2
1   Red    150       2
2  Blue     50       1
Note that transform('count') ignores NaNs. If you want to count NaNs, use transform(len).
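The difference between the two only shows up when the counted column has missing values. The sketch below swaps one value from the question's data for NaN (an assumption made purely to show the effect) and compares both calls; the names `non_null` and `all_rows` are hypothetical.

```python
import numpy as np
import pandas as pd

# Question's data, but with one Value replaced by NaN for illustration.
df = pd.DataFrame({"Color": ["Red", "Red", "Blue"], "Value": [100, np.nan, 50]})

grouped = df.groupby("Color")["Value"]

# 'count' skips the NaN, so each Red row sees only one valid value.
non_null = grouped.transform("count")

# len counts every row in the group, NaN included.
all_rows = grouped.transform(len)

print(non_null.tolist())
print(all_rows.tolist())
```

If the goal is "how many rows share this color" regardless of what is in Value, the len variant is the one that matches.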
To the anonymous editor: If you are getting an error while using transform('count'), it may be because your version of pandas is too old. The above works with pandas version 0.15 or newer.