pandas: count things

Tags: python, pandas

In the following, male_trips is a big pandas DataFrame and stations is a small pandas DataFrame. For each station id I'd like to know how many male trips started there. The following does the job, but takes a long time:

mc = [ sum( male_trips['start_station_id'] == id ) for id in stations['id'] ]

How should I go about this instead?

Update! So there were two main approaches: groupby() followed by size(), and the simpler .value_counts(). I did a quick timeit, and the groupby approach wins by quite a large margin! Here is the code:

from timeit import Timer
setup = "import pandas; male_trips=pandas.load('maletrips')"
a = "male_trips.start_station_id.value_counts()"
b = "male_trips.groupby('start_station_id').size()"
Timer(a,setup).timeit(100)
Timer(b,setup).timeit(100)

and here is the result:

In [4]: Timer(a,setup).timeit(100) # <- this is value_counts
Out[4]: 9.709594964981079

In [5]: Timer(b,setup).timeit(100) # <- this is groupby / size
Out[5]: 1.5574288368225098

Note that, at these speeds, when exploring data interactively, value_counts is marginally quicker to type and there is less to remember!
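For anyone wanting to repeat the comparison without the original 'maletrips' file, here is a minimal sketch on synthetic data. The random data, the row count, and the helper names with_value_counts and with_groupby_size are assumptions made for illustration. Timings depend on the pandas version and the data, so the figures above may not reproduce.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for male_trips (hypothetical data, not the original file).
rng = np.random.default_rng(0)
male_trips = pd.DataFrame(
    {"start_station_id": rng.integers(0, 300, size=100_000)}
)


def with_value_counts():
    return male_trips["start_station_id"].value_counts()


def with_groupby_size():
    return male_trips.groupby("start_station_id").size()


# value_counts sorts by frequency, groupby/size sorts by key, so sort the
# former by index before checking that both give the same counts.
vc = with_value_counts().sort_index()
gs = with_groupby_size()
same_counts = bool((vc.index == gs.index).all() and (vc.to_numpy() == gs.to_numpy()).all())

t_value_counts = timeit.timeit(with_value_counts, number=20)
t_groupby_size = timeit.timeit(with_groupby_size, number=20)
print(f"same counts: {same_counts}")
print(f"value_counts: {t_value_counts:.4f}s, groupby/size: {t_groupby_size:.4f}s")
```

The only difference in the results is ordering: value_counts() returns the most frequent station first, while groupby().size() returns stations in id order.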


asked Oct 12 '12 21:10

Mike Dewar

1 Answer

I'd do as Vishal suggests, but use size() instead of sum() to get a count of the number of rows in each 'start_station_id' group. So:

df = male_trips.groupby('start_station_id').size()
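Note that the result is indexed by the station ids that actually occur in male_trips, whereas the list comprehension in the question yields one entry per row of stations, including zeros. A minimal sketch of how the two could be lined up, assuming tiny made-up frames in place of the real data:

```python
import pandas as pd

# Tiny hypothetical stand-ins for the two frames in the question.
stations = pd.DataFrame({"id": [1, 2, 3, 4]})
male_trips = pd.DataFrame({"start_station_id": [1, 1, 3, 3, 3, 2]})

# One pass over male_trips: number of rows per start station.
counts = male_trips.groupby("start_station_id").size()

# Reindex by stations['id'] so the result follows the station order and
# stations with no trips get 0 instead of being left out.
mc = counts.reindex(stations["id"], fill_value=0).tolist()
print(mc)  # [2, 1, 3, 0]
```

Station 4 has no trips in this toy data, so it only appears in the output because of the reindex step.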

answered Oct 01 '22 12:10

Dani Arribas-Bel

