I am basically trying to reproduce climate diagrams showing mean temperature and precipitation over the year for various locations.
I've generated a pivot table from my csv the following way:
data = pd.read_csv("05_temp_rain_v2.csv")
pivot = data.pivot_table(["rain(mm)","temp(dC)"], ["loc","month"])
sample data in text form:
loc,lat,long,year,month,rain(mm),temp(dC)
Adria_-_Bellombra,45.011129,12.034126,1994,1,45.6,4.6
Adria_-_Bellombra,45.011129,12.034126,1994,2,31.4,4
Adria_-_Bellombra,45.011129,12.034126,1994,3,1.6,10.7
Adria_-_Bellombra,45.011129,12.034126,1994,4,74.4,11.5
Adria_-_Bellombra,45.011129,12.034126,1994,5,26,17.2
Adria_-_Bellombra,45.011129,12.034126,1994,6,108.6,20.6
Pivot Table:
Since I am handling various locations, I am iterating over them:
locations=pivot.index.get_level_values(0).unique()
for location in locations:
split=pivot.xs(location)
rain=split["rain(mm)"]
temp=split["temp(dC)"]
plt.subplots()
temp.plot(kind="line",color="r",).legend()
rain.plot(kind="bar").legend()
An example plot output is shown below:
Why are my temperature values being plotted starting from February (2)?
I assume it is because the temperature values are listed in the second column.
What would be the proper way to handle and plot different data (two columns) from a pivot table?
Plotting. Pandas pivot tables can be used in conjunction with the Pandas plotting functionality to create useful data visualisations. Simply adding . plot() to the end of your pivot table code will create a plot of the data.
Create a chart from a PivotTable Select a cell in your table. Select PivotTable Tools > Analyze > PivotChart. Select a chart. Select OK.
Pandas uses the plot() method to create diagrams. We can use Pyplot, a submodule of the Matplotlib library to visualize the diagram on the screen.
Creating a pivot table. By specifying the index and columns parameters in the pd. pivot_table() function, you can determine which features should appear in the columns and rows. In the values parameter, you should specify which feature should be used to fill in the cell values.
It's because line
and bar
plots do not set the xlim
the same way. The x-axis is interpreted as categorical data in case of the bar plot, whereas it is interpreted as continuous data for the line plot. The result being that xlim
and xticks
are not set identically in both situations.
Consider this:
In [4]: temp.plot(kind="line",color="r",)
Out[4]: <matplotlib.axes._subplots.AxesSubplot at 0x117f555d0>
In [5]: plt.xticks()
Out[5]: (array([ 1., 2., 3., 4., 5., 6.]), <a list of 6 Text xticklabel objects>)
where the position of the ticks is an array of float ranging from 1 to 6.
and
In [6]: rain.plot(kind="bar").legend()
Out[6]: <matplotlib.legend.Legend at 0x11c15e950>
In [7]: plt.xticks()
Out[7]: (array([0, 1, 2, 3, 4, 5]), <a list of 6 Text xticklabel objects>)
where the position of the ticks is an array of int ranging from 0 to 5.
So, the easier is to replace this part:
temp.plot(kind="line", color="r",).legend()
rain.plot(kind="bar").legend()
by:
rain.plot(kind="bar").legend()
plt.plot(range(len(temp)), temp, "r", label=temp.name)
plt.legend()
Thanks to jeanrjc's answer and this thread I think I'm finally quite satisfied!
for location in locations:
#print(pivot.xs(location, level=0))
split=pivot.xs(location)
rain=split["rain(mm)"]
temp=split["temp(dC)"]
fig = plt.figure()
ax1 = rain.plot(kind="bar")
ax2 = ax1.twinx()
ax2.plot(ax1.get_xticks(),temp,linestyle='-',color="r")
ax2.set_ylim((-5, 50.))
#ax1.set_ylim((0, 300.))
ax1.set_ylabel('Precipitation (mm)', color='blue')
ax2.set_ylabel('Temperature (°C)', color='red')
ax1.set_xlabel('Months')
plt.title(location)
labels = ['Jan','Feb','Mar','Apr','May','Jun', 'Jul','Aug','Sep','Oct','Nov','Dez']
#plt.xticks(range(12),labels,rotation=45)
ax1.set_xticklabels(labels, rotation=45)
I am receiving the following output, which is very close to what I intend:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With