Frequency table for a single variable

Tags: python, pandas, statistics, frequency

One last newbie pandas question for the day: How do I generate a frequency table for a single Series?

For example:

    my_series = pandas.Series([1,2,2,3,3,3])
    pandas.magical_frequency_function( my_series )

    >> {
         1 : 1,
         2 : 2,
         3 : 3
       }

Lots of googling has led me to Series.describe() and pandas.crosstab, but neither of these does quite what I need: one variable, counts by categories. Oh, and it'd be nice if it worked for different data types: strings, ints, etc.

276

asked Aug 31 '12 00:08

Abe

2 Answers

Maybe .value_counts()?

    >>> import pandas
    >>> my_series = pandas.Series([1,2,2,3,3,3, "fred", 1.8, 1.8])
    >>> my_series
    0       1
    1       2
    2       2
    3       3
    4       3
    5       3
    6    fred
    7     1.8
    8     1.8
    >>> counts = my_series.value_counts()
    >>> counts
    3       3
    2       2
    1.8     2
    fred    1
    1       1
    >>> len(counts)
    5
    >>> sum(counts)
    9
    >>> counts["fred"]
    1
    >>> dict(counts)
    {1.8: 2, 2: 2, 3: 3, 1: 1, 'fred': 1}
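For the exact dictionary shape asked for in the question, something like the following should work. This is only a minimal sketch: the names counts, freq_table and proportions are made up for illustration, and sort_index() is used so the table is ordered by value rather than by count.

```python
import pandas as pd

my_series = pd.Series([1, 2, 2, 3, 3, 3])

# Count each distinct value, then order the table by the value itself
# (value_counts() orders by count, largest first, by default).
counts = my_series.value_counts().sort_index()

# Convert to a plain dict, the shape shown in the question.
freq_table = counts.to_dict()
print(freq_table)  # {1: 1, 2: 2, 3: 3}

# normalize=True returns relative frequencies instead of raw counts.
proportions = my_series.value_counts(normalize=True).sort_index()
print(proportions.to_dict())
```

Note that value_counts() skips missing values by default; passing dropna=False should include them in the table.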

182

answered Oct 21 '22 22:10

DSM

You can use a list comprehension on a DataFrame to count the frequencies of its columns. Note that select_dtypes is a DataFrame method, so df below must be a DataFrame, not a Series:

    [df[c].value_counts() for c in list(df.select_dtypes(include=['O']).columns)]

Breakdown:

    df.select_dtypes(include=['O'])

Selects just the object-dtype columns (typically strings, i.e. the categorical data)

    list(df.select_dtypes(include=['O']).columns)

Turns the column names from above into a list

    [df[c].value_counts() for c in list(df.select_dtypes(include=['O']).columns)]

Iterates through the list above and applies value_counts() to each of the columns
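The per-column approach above can be sketched end to end as follows. The DataFrame and its column names are hypothetical, made up for illustration. The string columns are built with an explicit object dtype so the selection should behave the same across pandas versions, and a dict comprehension is swapped in for the list so each table stays labelled with its column name.

```python
import pandas as pd

# Hypothetical example data; column names are invented for illustration.
df = pd.DataFrame({
    "color": pd.Series(["red", "blue", "red", "green"], dtype=object),
    "size": pd.Series(["S", "M", "M", "M"], dtype=object),
    "price": [1.5, 2.0, 1.5, 3.0],
})

# Names of the object-dtype columns only; the numeric column is skipped.
object_columns = df.select_dtypes(include=["object"]).columns

# One frequency table per selected column, keyed by column name.
freq_by_column = {col: df[col].value_counts() for col in object_columns}

for col, counts in freq_by_column.items():
    print(col, counts.to_dict())
```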

answered Oct 21 '22 23:10

Shankar ARUL
