groupby multiple values, and plotting results

I'm using some data on fungicide usage which has the Year, Fungicide, and Amount used, along with some irrelevant columns, in a pandas DataFrame. It looks somewhat like:

Year, State,      Fungicide, Value
2011, California, A,         12879
2011, California, B,         29572
2011, Florida,    A,         8645
2011, Florida,    B,         19573
2009, California, A,         8764
2009, California, B,         98643
...

What I want from it is a single plot of total fungicide used over time, with a line plotted for each individual fungicide (in a different colour). I've used .groupby to get the total amount of each fungicide used each year:

apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()

This gives me the values I want to plot, something like:

Year, Fungicide, Value
...
2009, A,        128635
      B,        104765
2011, A,        154829
      B,        129865

Now I need to plot it so that each fungicide (A, B, ...) is a separate line on a single plot of Value over time.

Is there a way of doing this without separating it all out? Forgive my ignorance, I'm new to Python and am still getting familiar with it.

A. Chatfield asked Dec 11 '15 14:12


1 Answer

For a clean solution that properly prints the legend and xticks, you could unstack the Fungicide level into columns and plot the result:

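import pandas as pd
apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()
# apple_fplot is a Series, so unstack already gives one column per fungicide
plot_df = apple_fplot.unstack('Fungicide')
# annual PeriodIndex so the x-axis is labelled with whole years
plot_df.index = pd.PeriodIndex(plot_df.index.tolist(), freq='Y')
plot_df.plot()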
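To try the approach end to end, here is a minimal sketch that rebuilds only the six sample rows shown in the question, so the totals differ from the real data. The name `totals` and the axis labels are illustrative assumptions, and the years are converted to strings before building the PeriodIndex, which is a small variation rather than part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend, remove to open a window
import pandas as pd

# Sample rows copied from the question (the real data has more rows and columns).
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "State": ["California", "California", "Florida",
              "Florida", "California", "California"],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Sum Value over all states for each (Year, Fungicide) pair.
totals = df.groupby(["Year", "Fungicide"])["Value"].sum()

# Reshape: one row per year, one column per fungicide.
plot_df = totals.unstack("Fungicide")

# Annual periods give whole-year tick labels on the x-axis.
plot_df.index = pd.PeriodIndex(plot_df.index.astype(str), freq="Y")

# DataFrame.plot draws one line per column and builds the legend from column names.
ax = plot_df.plot(marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Total Value")
```

To view the figure, call `ax.figure.savefig(...)` with a path of your choice, or drop the backend line and use `matplotlib.pyplot.show()`.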

For subplots, just set the subplots keyword to True:

plot_df.plot(subplots=True)

This draws each fungicide in its own panel.
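With `subplots=True`, `DataFrame.plot` returns an array of Axes, one per column, so each panel can be labelled separately. The sketch below again uses only the sample rows from the question; the `sharex` and `figsize` values and the y-axis labels are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend, remove to open a window
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Same reshaping as before: years as rows, fungicides as columns.
plot_df = df.groupby(["Year", "Fungicide"])["Value"].sum().unstack("Fungicide")
plot_df.index = pd.PeriodIndex(plot_df.index.astype(str), freq="Y")

# One panel per fungicide, stacked vertically with a shared x-axis.
axes = plot_df.plot(subplots=True, sharex=True, figsize=(6, 4))

# Label each panel with the fungicide it shows.
for ax, name in zip(axes, plot_df.columns):
    ax.set_ylabel(f"Fungicide {name}")
```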

Stefan answered Oct 07 '22 14:10