
What is the equivalent of Python Pandas value_counts in SQL?

Tags: python, sql, pandas

I learned Python and pandas before SQL, so this question is a bit basic.

For example, I have a type column with values like 1, 2, 3.

When I call df['type'].value_counts(), I get the number of rows for each value of type, something like:

1: 1000 rows
2: 220 rows
3: 100 rows

What is the equivalent in SQL? I believe it involves GROUP BY and COUNT?


asked Mar 14 '18 02:03

cqcn1991

2 Answers

If you want to know how many times each value occurs in a column, use:

-- your_table is a placeholder; TABLE itself is a reserved word
SELECT type, count(*)
FROM your_table
GROUP BY type
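To try this without a database server, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. The table name `items` and the sample rows are made up for illustration. The `ORDER BY` clause is an addition: `value_counts()` sorts by count in descending order by default, while `GROUP BY` alone guarantees no particular order.

```python
import sqlite3

# In-memory database with a hypothetical table named "items".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (type INTEGER)")
conn.executemany(
    "INSERT INTO items (type) VALUES (?)",
    [(1,), (1,), (1,), (2,), (2,), (3,)],
)

# GROUP BY yields one row per distinct value; ORDER BY n DESC mimics
# the descending-by-count order that value_counts() uses by default.
counts = conn.execute(
    """
    SELECT type, COUNT(*) AS n
    FROM items
    GROUP BY type
    ORDER BY n DESC
    """
).fetchall()

print(counts)  # [(1, 3), (2, 2), (3, 1)]
```

Each tuple is (value, number of rows), which corresponds to the index and values of the Series that `value_counts()` returns.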

answered Oct 28 '22 16:10

Turophile

-- your_table is a placeholder; TABLE itself is a reserved word
SELECT type, count(1) as num_types
FROM your_table
GROUP BY type

will return the equivalent row counts.
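One difference worth checking is missing values: `value_counts()` drops NaN by default (`dropna=True`), whereas `GROUP BY` puts NULLs into a group of their own. The sketch below, again using an in-memory SQLite database with a hypothetical `items` table, compares `COUNT(*)`, `COUNT(1)` and `COUNT(type)` and shows one way to exclude NULLs. Other database engines should behave the same way for these counts, but that is an assumption to verify on your own system.

```python
import sqlite3

# Hypothetical table with one NULL in the "type" column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (type INTEGER)")
conn.executemany(
    "INSERT INTO items (type) VALUES (?)",
    [(1,), (1,), (2,), (None,)],
)

# COUNT(*) and COUNT(1) both count rows in each group;
# COUNT(type) counts only the non-NULL values of type.
rows = conn.execute(
    """
    SELECT type, COUNT(*), COUNT(1), COUNT(type)
    FROM items
    GROUP BY type
    """
).fetchall()
by_value = {r[0]: r[1:] for r in rows}
print(by_value)

# Filtering NULLs out first matches value_counts() with its default dropna=True.
no_nulls = conn.execute(
    """
    SELECT type, COUNT(*) AS n
    FROM items
    WHERE type IS NOT NULL
    GROUP BY type
    ORDER BY n DESC
    """
).fetchall()
print(no_nulls)  # [(1, 2), (2, 1)]
```

If the NULL group is wanted, the pandas counterpart is `value_counts(dropna=False)`.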

answered Oct 28 '22 18:10

Ash Chakraborty
