Python: get a frequency count based on two columns (variables) in a pandas dataframe (how often each row appears)

Tags:

python

pandas

dataframe

group-by

Hello, I have the following dataframe.

    Group           Size

    Short          Small
    Short          Small
    Moderate       Medium
    Moderate       Small
    Tall           Large

I want to count how many times each row appears in the dataframe.

    Group           Size      Time

    Short          Small        2
    Moderate       Medium       1 
    Moderate       Small        1
    Tall           Large        1

323

asked Oct 21 '15 23:10

emax

3 Answers

You can use groupby's size:

In [11]: df.groupby(["Group", "Size"]).size()
Out[11]:
Group     Size
Moderate  Medium    1
          Small     1
Short     Small     2
Tall      Large     1
dtype: int64

In [12]: df.groupby(["Group", "Size"]).size().reset_index(name="Time")
Out[12]:
      Group    Size  Time
0  Moderate  Medium     1
1  Moderate   Small     1
2     Short   Small     2
3      Tall   Large     1
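
For reference, here is a self-contained sketch of the same approach. The way df is built below is an assumption (the question only shows the printed frame), so adjust it to however your data is actually loaded.

```python
import pandas as pd

# Rebuild the question's frame by hand (assumed construction).
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# size() counts the rows in each (Group, Size) group; reset_index(name=...)
# turns the resulting Series into a flat DataFrame with a "Time" column.
counts = df.groupby(["Group", "Size"]).size().reset_index(name="Time")
print(counts)
```

groupby sorts the group keys by default, which is why Moderate comes first. Passing sort=False to groupby should keep the groups in order of first appearance, which is closer to the layout shown in the question.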

105

answered Oct 20 '22 12:10

Andy Hayden

Update: since pandas 1.1, value_counts accepts multiple columns:

df.value_counts(["Group", "Size"])
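
A minimal sketch of how that call could be turned into the requested table, assuming pandas 1.1 or later and a hand-built df (the construction is an assumption, not from the question):

```python
import pandas as pd

# Rebuild the question's frame by hand (assumed construction).
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# DataFrame.value_counts counts unique rows over the listed columns and,
# by default, sorts the result by count in descending order.
counts = df.value_counts(["Group", "Size"]).reset_index(name="Time")
print(counts)
```

Because of the default sort, the most frequent pair (Short, Small) ends up in the first row; pass sort=False if you do not want the result ordered by count.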

You can also try pd.crosstab()

Group           Size

Short          Small
Short          Small
Moderate       Medium
Moderate       Small
Tall           Large

pd.crosstab(df.Group,df.Size)


Size      Large  Medium  Small
Group                         
Moderate      0       1      1
Short         0       0      2
Tall          1       0      0

EDIT: To get your desired output:

pd.crosstab(df.Group,df.Size).replace(0,np.nan).\
     stack().reset_index().rename(columns={0:'Time'})
Out[591]: 
      Group    Size  Time
0  Moderate  Medium   1.0
1  Moderate   Small   1.0
2     Short   Small   2.0
3      Tall   Large   1.0
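
Note that replacing 0 with NaN converts the counts to floats, which is why Time shows 1.0 and 2.0. If integer counts are preferred, one possible variant is to stack first and then filter out the zero counts. This is only a sketch, with df rebuilt by hand as an assumption:

```python
import pandas as pd

# Rebuild the question's frame by hand (assumed construction).
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# stack() reshapes the Group x Size table into a Series indexed by
# (Group, Size). Zero counts are pairs that never occur, so drop them
# with a boolean filter instead of going through NaN.
stacked = pd.crosstab(df["Group"], df["Size"]).stack()
counts = stacked[stacked > 0].reset_index(name="Time")
print(counts)
```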

answered Oct 20 '22 12:10

BENY

Another possibility is using .pivot_table() with aggfunc='size':

df_solution = df.pivot_table(index=['Group','Size'], aggfunc='size')
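
As far as I can tell, with no value columns left over this returns a Series indexed by (Group, Size), so it can be flattened the same way as the groupby result. A sketch under that assumption, with df rebuilt by hand (also an assumption):

```python
import pandas as pd

# Rebuild the question's frame by hand (assumed construction).
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# aggfunc="size" counts the rows in each (Group, Size) group.
df_solution = df.pivot_table(index=["Group", "Size"], aggfunc="size")

# Assumes df_solution is a Series; name the counts column "Time".
counts = df_solution.reset_index(name="Time")
print(counts)
```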

answered Oct 20 '22 14:10

asantz96
