Plotting results of Pandas GroupBy

Tags:

I'm starting to learn Pandas and am trying to find the most Pythonic (or panda-thonic?) ways to do certain tasks.

Suppose we have a DataFrame with columns A, B, and C.

Column A contains boolean values: each row's A value is either true or false.
Column B has some important values we want to plot.

What we want to discover is the subtle distinctions between B values for rows that have A set to false, vs. B values for rows that have A is true.

In other words, how can I group by the value of column A (either true or false), then plot the values of column B for both groups on the same graph? The two datasets should be colored differently to be able to distinguish the points.

Next, let's add another feature to this program: before graphing, we want to compute another value for each row and store it in column D. This value is the mean of all data stored in B for the entire five minutes before a record - but we only include rows that have the same boolean value stored in A.

In other words, if I have a row where A=True and time=t, I want to compute a value for column D that is the mean of B for all records from time t-5 to t that have the same A=True.

In this case, how can we execute the groupby on values of A, then apply this computation to each individual group, and finally plot the D values for the two groups?

648

asked Mar 17 '13 20:03

Maxim Zaslavsky

1 Answers

I think @herrfz hit all the high points. I'll just flesh out the details:

import pandas as pd import numpy as np import matplotlib.pyplot as plt  sin = np.sin cos = np.cos pi = np.pi N = 100  x = np.linspace(0, pi, N) a = sin(x) b = cos(x)  df = pd.DataFrame({     'A': [True]*N + [False]*N,     'B': np.hstack((a,b))     })  for key, grp in df.groupby(['A']):     plt.plot(grp['B'], label=key)     grp['D'] = pd.rolling_mean(grp['B'], window=5)         plt.plot(grp['D'], label='rolling ({k})'.format(k=key)) plt.legend(loc='best')     plt.show()

enter image description here

118

answered Oct 03 '22 12:10

unutbu

Related questions
                            
                                numerically stable way to multiply log probability matrices in numpy
                            
                                How to obtain the same font(-style, -size etc.) in matplotlib output as in latex output?
                            
                                Name of a function returning a generator
                            
                                Scrapy Python Set up User Agent
                            
                                Necessity of explicit cursor.close()
                            
                                "Too many indexers" with DataFrame.loc
                            
                                Airbnb Airflow vs Apache Nifi [closed]
                            
                                Does get_or_create() have to save right away? (Django)
                            
                                Commit in git only if tests pass
                            
                                Why does pandas apply calculate twice
                            
                                How to use gettext with python >3.6 f-strings
                            
                                Nodejs: Where or How to write complicated business logic?
                            
                                Numpy quirk: Apply function to all pairs of two 1D arrays, to get one 2D array
                            
                                Cyclic module dependencies and relative imports in Python
                            
                                Pip install forked github-repo
                            
                                How is the feature score(/importance) in the XGBoost package calculated?
                            
                                Angle between points?
                            
                                Python's equivalent for R's dput() function
                            
                                Python >=3.5: Checking type annotation at runtime
                            
                                104, 'Connection reset by peer' socket error, or When does closing a socket result in a RST rather than FIN?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Plotting results of Pandas GroupBy

Tags:

python

pandas

matplotlib

group-by

data-analysis

Maxim Zaslavsky

People also ask

1 Answers

unutbu

Recent Activity

Donate For Us