Pandas Plotting from Pivot Table

Tags: python, python-3.x, pandas, matplotlib, pivot-table

I am basically trying to reproduce climate diagrams showing mean temperature and precipitation over the year for various locations.

I've generated a pivot table from my csv the following way:

import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("05_temp_rain_v2.csv")
pivot = data.pivot_table(["rain(mm)","temp(dC)"], ["loc","month"])

sample data in text form:

loc,lat,long,year,month,rain(mm),temp(dC)
Adria_-_Bellombra,45.011129,12.034126,1994,1,45.6,4.6  
Adria_-_Bellombra,45.011129,12.034126,1994,2,31.4,4  
Adria_-_Bellombra,45.011129,12.034126,1994,3,1.6,10.7  
Adria_-_Bellombra,45.011129,12.034126,1994,4,74.4,11.5  
Adria_-_Bellombra,45.011129,12.034126,1994,5,26,17.2  
Adria_-_Bellombra,45.011129,12.034126,1994,6,108.6,20.6

Pivot Table:

[Image: the resulting pivot table, https://i.stack.imgur.com/eB5Xn.png]
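For readers without the CSV file, the pivot step can be reproduced from the six sample rows above. This is only a minimal sketch: the `SAMPLE_CSV` string is an illustrative stand-in for the real file, and `pivot_table` is left at its default aggregation (the mean), which is what yields mean values per location and month.

```python
import io

import pandas as pd

# Assumption: the six sample rows from the question, inlined in place of the CSV file.
SAMPLE_CSV = """loc,lat,long,year,month,rain(mm),temp(dC)
Adria_-_Bellombra,45.011129,12.034126,1994,1,45.6,4.6
Adria_-_Bellombra,45.011129,12.034126,1994,2,31.4,4
Adria_-_Bellombra,45.011129,12.034126,1994,3,1.6,10.7
Adria_-_Bellombra,45.011129,12.034126,1994,4,74.4,11.5
Adria_-_Bellombra,45.011129,12.034126,1994,5,26,17.2
Adria_-_Bellombra,45.011129,12.034126,1994,6,108.6,20.6
"""

data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Same call as in the question, with the positional arguments spelled out:
# the values are averaged (default aggfunc) for each (loc, month) pair.
pivot = data.pivot_table(values=["rain(mm)", "temp(dC)"], index=["loc", "month"])

print(pivot)
```

The result has a two-level index (`loc`, `month`) and one column per measured variable, so `pivot.xs(location)` returns a frame indexed by month only.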

Since I am handling various locations, I am iterating over them:

locations=pivot.index.get_level_values(0).unique()

for location in locations:
    split=pivot.xs(location)

    rain=split["rain(mm)"]
    temp=split["temp(dC)"]

    plt.subplots()
    temp.plot(kind="line",color="r",).legend()
    rain.plot(kind="bar").legend()

An example plot output is shown below:

[Image: example plot output, https://i.stack.imgur.com/TnIIG.png]

Why are my temperature values being plotted starting from February (2)?
I assume it is because the temperature values are listed in the second column.

What would be the proper way to handle and plot different data (two columns) from a pivot table?


asked Mar 21 '16 13:03

cir

2 Answers

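It's because line and bar plots do not position data on the x-axis the same way. The bar plot treats the index as categorical: the bars are placed at integer positions 0 to n-1 and the index values are used only as tick labels. The line plot treats the index as continuous and uses its values as x coordinates. As a result, xlim and xticks are not set identically in the two cases, and the temperature for month 1 is drawn at x = 1, which is the position of the second bar.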

Consider this:

In [4]: temp.plot(kind="line",color="r",)
Out[4]: <matplotlib.axes._subplots.AxesSubplot at 0x117f555d0>
In [5]: plt.xticks()
Out[5]: (array([ 1.,  2.,  3.,  4.,  5.,  6.]), <a list of 6 Text xticklabel objects>)

Here the tick positions are an array of floats ranging from 1 to 6.

Compare with:

In [6]: rain.plot(kind="bar").legend()
Out[6]: <matplotlib.legend.Legend at 0x11c15e950>
In [7]: plt.xticks()
Out[7]: (array([0, 1, 2, 3, 4, 5]), <a list of 6 Text xticklabel objects>)

Here the tick positions are an array of ints ranging from 0 to 5.

So the easiest fix is to replace this part:

temp.plot(kind="line", color="r",).legend()
rain.plot(kind="bar").legend()

with:

rain.plot(kind="bar").legend()
plt.plot(range(len(temp)), temp, "r", label=temp.name)
plt.legend()

[Image: bar and line plot produced with pandas, https://i.stack.imgur.com/0wFRo.png]
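The replacement above can be checked in a self-contained way. The following is a rough sketch, not the exact session from this answer: the two Series are rebuilt by hand from the sample rows in the question, and the `Agg` backend is an assumption made so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, no window is opened

import matplotlib.pyplot as plt
import pandas as pd

# Hand-built stand-ins for split["rain(mm)"] and split["temp(dC)"].
months = pd.Index([1, 2, 3, 4, 5, 6], name="month")
rain = pd.Series([45.6, 31.4, 1.6, 74.4, 26.0, 108.6], index=months, name="rain(mm)")
temp = pd.Series([4.6, 4.0, 10.7, 11.5, 17.2, 20.6], index=months, name="temp(dC)")

fig, ax = plt.subplots()

# The bar plot puts one bar at each integer position 0..n-1.
rain.plot(kind="bar", ax=ax)
bar_centers = [patch.get_x() + patch.get_width() / 2 for patch in ax.patches]

# Draw the line at the same 0..n-1 positions instead of at the month numbers.
(line,) = ax.plot(range(len(temp)), temp.to_numpy(), "r", label=temp.name)
ax.legend()

print(bar_centers)
print(list(line.get_xdata()))
```

Both printed lists should describe the same positions, which is why the first temperature value now sits on top of the first bar.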

answered Sep 28 '22 12:09

jrjc

Thanks to jeanrjc's answer and this thread I think I'm finally quite satisfied!

for location in locations:
    # print(pivot.xs(location, level=0))

    split = pivot.xs(location)
    rain = split["rain(mm)"]
    temp = split["temp(dC)"]

    fig = plt.figure()
    ax1 = rain.plot(kind="bar")
    ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
    # draw the line at the bar positions (0..n-1), not at the month numbers
    ax2.plot(ax1.get_xticks(), temp, linestyle='-', color="r")
    ax2.set_ylim((-5, 50.))
    # ax1.set_ylim((0, 300.))
    ax1.set_ylabel('Precipitation (mm)', color='blue')
    ax2.set_ylabel('Temperature (°C)', color='red')
    ax1.set_xlabel('Months')
    plt.title(location)
    labels = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    # plt.xticks(range(12), labels, rotation=45)
    ax1.set_xticklabels(labels, rotation=45)

I now get the following output, which is very close to what I intended:

[Image: sample plot]
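One possible variation on the loop body, wrapped in a helper so it can be reused per location. This is a sketch under assumptions: `plot_climate` is a hypothetical name, the month labels are derived from the index via the standard library's `calendar.month_abbr` rather than a hard-coded list of twelve (so it also works when fewer months are present, as in the six-row sample), and the `Agg` backend is chosen only so it runs without a display.

```python
import calendar

import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, no window is opened

import matplotlib.pyplot as plt
import pandas as pd


def plot_climate(rain, temp, title):
    """Hypothetical helper: precipitation bars plus a temperature line on a twin axis."""
    fig, ax1 = plt.subplots()
    rain.plot(kind="bar", ax=ax1, color="b")

    # Second y-axis; the line is drawn at the bar positions 0..n-1.
    ax2 = ax1.twinx()
    ax2.plot(ax1.get_xticks(), temp.to_numpy(), linestyle="-", color="r")

    ax1.set_ylabel("Precipitation (mm)", color="blue")
    ax2.set_ylabel("Temperature (°C)", color="red")
    ax1.set_xlabel("Months")
    ax1.set_title(title)

    # One label per month actually present in the index (1 -> Jan, 2 -> Feb, ...).
    labels = [calendar.month_abbr[int(m)] for m in rain.index]
    ax1.set_xticklabels(labels, rotation=45)
    return fig, ax1, ax2


# Usage with hand-built stand-ins for one location of the pivot table.
months = pd.Index([1, 2, 3, 4, 5, 6], name="month")
rain = pd.Series([45.6, 31.4, 1.6, 74.4, 26.0, 108.6], index=months, name="rain(mm)")
temp = pd.Series([4.6, 4.0, 10.7, 11.5, 17.2, 20.6], index=months, name="temp(dC)")
fig, ax1, ax2 = plot_climate(rain, temp, "Adria_-_Bellombra")
```

Inside the original loop this would be called as `plot_climate(split["rain(mm)"], split["temp(dC)"], location)`. Note that `calendar.month_abbr` follows the current locale, so the labels are English only under the default locale.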

answered Sep 28 '22 11:09

cir
