Count by unique pair of columns in pandas [duplicate]

Tags:

python

pandas

I'm trying to figure out how to count the number of rows for each unique pair of column values (ip, useragent), e.g.

d = pd.DataFrame({'ip': ['192.168.0.1', '192.168.0.1', '192.168.0.1', '192.168.0.2'],
                  'useragent': ['a', 'a', 'b', 'b']})

     ip              useragent
0    192.168.0.1     a
1    192.168.0.1     a
2    192.168.0.1     b
3    192.168.0.2     b

To produce:

ip           useragent
192.168.0.1  a           2
192.168.0.1  b           1
192.168.0.2  b           1

Ideas?


asked Dec 01 '12 13:12

barnybug

2 Answers

Group by both columns with groupby and call size() to count the rows in each group:

d.groupby(['ip', 'useragent']).size()

produces:

ip          useragent
192.168.0.1 a           2
            b           1
192.168.0.2 b           1
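The result of size() is a Series whose index is built from the (ip, useragent) pairs. As a minimal sketch of working with it, the example below looks up one pair and flattens the result into an ordinary DataFrame. The variable names and the column name 'count' are chosen here for illustration and are not part of the original answer.

```python
import pandas as pd

d = pd.DataFrame({
    'ip': ['192.168.0.1', '192.168.0.1', '192.168.0.1', '192.168.0.2'],
    'useragent': ['a', 'a', 'b', 'b'],
})

# size() returns a Series indexed by the (ip, useragent) pairs
counts = d.groupby(['ip', 'useragent']).size()

# Look up the count for a single pair by its index tuple
pair_count = counts.loc[('192.168.0.1', 'a')]

# Flatten into a regular DataFrame; 'count' is an illustrative column name
flat = counts.reset_index(name='count')
print(flat)
```

Passing name= to reset_index avoids the default column label 0 that the counts would otherwise get.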

answered Sep 22 '22 17:09

Matti John

print(d.groupby(['ip', 'useragent']).size().reset_index().rename(columns={0:''}))

gives:

            ip useragent   
0  192.168.0.1         a  2
1  192.168.0.1         b  1
2  192.168.0.2         b  1

Another nice option might be pandas.crosstab:

print(pd.crosstab(d.ip, d.useragent))
print('\nsome cosmetics:')
print(pd.crosstab(d.ip, d.useragent).reset_index().rename_axis('', axis='columns'))

gives:

useragent    a  b
ip               
192.168.0.1  2  1
192.168.0.2  0  1

some cosmetics:
            ip  a  b
0  192.168.0.1  2  1
1  192.168.0.2  0  1
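Unlike the groupby result, the crosstab table also includes pairs that never occur, with a count of 0. As a rough sketch of how the two approaches relate (the variable names are illustrative), the same wide table can be built by unstacking the grouped counts and filling the missing pairs with zeros:

```python
import pandas as pd

d = pd.DataFrame({
    'ip': ['192.168.0.1', '192.168.0.1', '192.168.0.1', '192.168.0.2'],
    'useragent': ['a', 'a', 'b', 'b'],
})

# Long form: one entry per pair that actually occurs
counts = d.groupby(['ip', 'useragent']).size()

# Pivot the useragent level into columns; missing pairs become 0
wide = counts.unstack(fill_value=0)

# crosstab builds the same ip-by-useragent table directly
table = pd.crosstab(d.ip, d.useragent)

print(wide)
print(table)
```

The long form is usually easier to filter or merge, while the wide form is easier to read when the second column has only a few distinct values.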

answered Sep 22 '22 17:09

Markus Dutschke
