I have data in the format below, and I am trying to:
1) loop over each value in Region
2) for each region, plot a time series of Sales aggregated (summed) across Category.
Date       | Region | Category  | Sales
01/01/2016 | USA    | Furniture | 1
01/01/2016 | USA    | Clothes   | 0
01/01/2016 | Europe | Furniture | 2
01/01/2016 | Europe | Clothes   | 0
01/02/2016 | USA    | Furniture | 3
01/02/2016 | USA    | Clothes   | 0
01/02/2016 | Europe | Furniture | 4
01/02/2016 | Europe | Clothes   | 0
...
The plot should look like the attached chart (done in Excel).

However, if I try to do it in Python using the below, I get multiple charts when I really want all the lines to show up in one figure.
Python code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv(r'C:\Users\wusm\Desktop\Book7.csv')
plt.legend()
for index, group in df.groupby(["Region"]):
    group.plot(x='Date', y='Sales', title=str(index))
plt.show()
Short of reformatting the data, could anyone advise on how to get the graphs in one figure please?
You can use pivot_table to sum Sales across Category and give each region its own column:
df = df.pivot_table(index='Date', columns='Region', values='Sales', aggfunc='sum')
print(df)
Region      Europe  USA
Date
01/01/2016       2    1
01/02/2016       4    3
Alternatively, starting again from the original DataFrame, use groupby + sum + unstack:
df = df.groupby(['Date', 'Region'])['Sales'].sum().unstack(fill_value=0)
print(df)
Region      Europe  USA
Date
01/01/2016       2    1
01/02/2016       4    3
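Then call DataFrame.plot on the reshaped frame, which draws one line per column (here, per region) on a single set of axes: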
df.plot()
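For reference, here is a minimal end-to-end sketch of the approach above, plus a variant that keeps the original loop but draws every region on one shared Axes. The sample rows are copied from the question and read from an in-memory string instead of the CSV file. The names `raw`, `wide`, `ax`, and `ax2`, the month/day/year date format, and the plot title are assumptions made for illustration.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question; in practice this would come from pd.read_csv(path).
raw = """Date,Region,Category,Sales
01/01/2016,USA,Furniture,1
01/01/2016,USA,Clothes,0
01/01/2016,Europe,Furniture,2
01/01/2016,Europe,Clothes,0
01/02/2016,USA,Furniture,3
01/02/2016,USA,Clothes,0
01/02/2016,Europe,Furniture,4
01/02/2016,Europe,Clothes,0
"""
df = pd.read_csv(io.StringIO(raw))

# Assumption: dates are month/day/year. Parsing them gives a real time axis.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Option 1: reshape to one column per region, then plot once.
wide = df.pivot_table(index="Date", columns="Region", values="Sales", aggfunc="sum")
ax = wide.plot(title="Sales by region")

# Option 2: keep the loop, but pass the same Axes to every plot call.
# Grouping by the string "Region" (not a one-element list) keeps each key a plain string.
fig, ax2 = plt.subplots()
for region, group in df.groupby("Region"):
    totals = group.groupby("Date")["Sales"].sum()  # aggregate across Category
    totals.plot(ax=ax2, label=region)
ax2.legend()
```

Call plt.show() afterwards to display the figures when running this as a script. The loop in the question produced separate charts because each group.plot call without an ax argument creates a new figure; passing a shared ax avoids that without reshaping the data.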