Python Pandas: Convert ".value_counts" output to dataframe

Hi, I want to get the counts of unique values in a dataframe column. value_counts does this, but I want to use its output somewhere else. How can I convert the .value_counts output to a pandas dataframe? Here is some example code:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1, 2, 2, 2]})
    value_counts = df['a'].value_counts(dropna=True, sort=True)
    print(value_counts)
    print(type(value_counts))

The output is:

    2    3
    1    2
    Name: a, dtype: int64
    <class 'pandas.core.series.Series'>

What I need is a dataframe like this:

    unique_values  counts
    2              3
    1              2

Thank you.

asked Nov 06 '17 11:11

s900n

2 Answers

Use rename_axis to name the column that will be created from the index, then call reset_index:

    counts_df = df['a'].value_counts().rename_axis('unique_values').reset_index(name='counts')
    print(counts_df)

       unique_values  counts
    0              2       3
    1              1       2

Or, if you need a one-column DataFrame, use Series.to_frame:

    counts_df = df['a'].value_counts().rename_axis('unique_values').to_frame('counts')
    print(counts_df)

                   counts
    unique_values
    2                   3
    1                   2
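If you need this in more than one place, the rename_axis / reset_index chain can be wrapped in a small helper. This is only a sketch: the function name value_counts_frame and its parameter names are made up for illustration and are not part of pandas.

```python
import pandas as pd


def value_counts_frame(series, value_name="unique_values", count_name="counts"):
    """Return the value counts of `series` as a two-column DataFrame.

    Hypothetical helper, not a pandas API.
    """
    return (
        series.value_counts()
        .rename_axis(value_name)       # name the index that holds the unique values
        .reset_index(name=count_name)  # move the index into a column and name the counts
    )


df = pd.DataFrame({"a": [1, 1, 2, 2, 2]})
result = value_counts_frame(df["a"])
print(result)
```

Because both column names are set explicitly, the result should not depend on what name pandas gives the intermediate Series.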

answered Sep 19 '22 14:09

jezrael

I just ran into the same problem, so I'll share my thoughts here.

Warning

When you work with pandas data structures, you have to be aware of the return type of each call.
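A quick way to see this is to check the types directly. The sketch below reuses the data from the question; the variable names as_frame and as_reset are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2, 2]})
value_counts = df["a"].value_counts()

# value_counts() on a single column returns a Series, not a DataFrame
print(type(value_counts))
is_series = isinstance(value_counts, pd.Series)

# to_frame() and reset_index() on that Series both return a DataFrame
as_frame = value_counts.to_frame("counts")
as_reset = value_counts.reset_index(name="counts")
print(type(as_frame), type(as_reset))
```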

Another solution here

As @jezrael mentioned, pandas provides the pd.Series.to_frame API.

Step 1

You can also wrap the pd.Series in a pd.DataFrame directly:

    df_value_counts = pd.DataFrame(value_counts)  # wrap pd.Series in a pd.DataFrame

You then have a pd.DataFrame with a single column named 'a', and the unique values become the index. (In pandas 2.0 and later, the column is named 'count' instead.)

    Input:  print(df_value_counts.index.values)
    Output: [2 1]

    Input:  print(df_value_counts.columns)
    Output: Index(['a'], dtype='object')

Step 2

What now?

If you want the unique values as a regular column with your own column names, first turn the index into a column with reset_index().

Then rename the columns by assigning a list to df.columns:

    df_value_counts = df_value_counts.reset_index()
    df_value_counts.columns = ['unique_values', 'counts']

Then you get what you need:

    Output:
       unique_values  counts
    0              2       3
    1              1       2

Full Answer here

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1, 2, 2, 2]})
    value_counts = df['a'].value_counts(dropna=True, sort=True)

    # solution here
    df_val_counts = pd.DataFrame(value_counts)
    df_value_counts_reset = df_val_counts.reset_index()
    df_value_counts_reset.columns = ['unique_values', 'counts']  # change column names
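As a sanity check, the wrap-and-rename approach can be compared against the rename_axis / reset_index chain from the other answer. This is a rough sketch; the names wrapped, chained and same are only for illustration. The intermediate column names after reset_index() may differ between pandas versions, which is why the columns are assigned by position here.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2, 2]})
value_counts = df["a"].value_counts(dropna=True, sort=True)

# Approach 1: wrap in a DataFrame, reset the index, rename columns by position
wrapped = pd.DataFrame(value_counts).reset_index()
print(list(wrapped.columns))  # intermediate names depend on the pandas version
wrapped.columns = ["unique_values", "counts"]

# Approach 2: name the index and the counts column explicitly
chained = value_counts.rename_axis("unique_values").reset_index(name="counts")

# Compare column names and values of the two results
same = wrapped.to_dict("list") == chained.to_dict("list")
print(same)
```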

answered Sep 16 '22 14:09

WY Hsu
