Matplotlib/Pandas error using histogram

I have a problem making histograms from pandas Series objects, and I can't understand why it fails. The code worked fine before, but now it does not.

Here is a bit of my code (specifically, a pandas series object I'm trying to make a histogram of):

type(dfj2_MARKET1['VSPD2_perc']) 

which outputs the result: pandas.core.series.Series

Here's my plotting code:

fig, axes = plt.subplots(1, 7, figsize=(30,4))
axes[0].hist(dfj2_MARKET1['VSPD1_perc'],alpha=0.9, color='blue')
axes[0].grid(True)
axes[0].set_title(MARKET1 + '  5-40 km / h')

Error message:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-75-3810c361db30> in <module>()
          1 fig, axes = plt.subplots(1, 7, figsize=(30,4))
          2
    ----> 3 axes[1].hist(dfj2_MARKET1['VSPD2_perc'],alpha=0.9, color='blue')
          4 axes[1].grid(True)
          5 axes[1].set_xlabel('Time spent [%]')

    C:\Python27\lib\site-packages\matplotlib\axes.pyc in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
       8322             # this will automatically overwrite bins,
       8323             # so that each histogram uses the same bins
    -> 8324             m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
       8325             m = m.astype(float) # causes problems later if it's an int
       8326             if mlast is None:

    C:\Python27\lib\site-packages\numpy\lib\function_base.pyc in histogram(a, bins, range, normed, weights, density)
        158         if (mn > mx):
        159             raise AttributeError(
    --> 160                 'max must be larger than min in range parameter.')
        161
        162     if not iterable(bins):

    AttributeError: max must be larger than min in range parameter.
jonas asked Dec 18 '13 11:12



2 Answers

This error occurs, among other things, when the Series contains NaN values. Could that be the case?

NaN values are not handled well by matplotlib's hist function. For example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

s = pd.Series([1,2,3,2,2,3,5,2,3,2,np.nan])
fig, ax = plt.subplots()
ax.hist(s, alpha=0.9, color='blue')

produces the same error: AttributeError: max must be larger than min in range parameter. One option is to remove the NaN values before plotting. This will work:

ax.hist(s.dropna(), alpha=0.9, color='blue') 

Another option is to use the pandas hist method on your Series and pass axes[0] to the ax keyword:

dfj2_MARKET1['VSPD1_perc'].hist(ax=axes[0], alpha=0.9, color='blue') 
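
Both options can be tried end to end in a minimal sketch like the one below. It assumes a small made-up Series standing in for dfj2_MARKET1['VSPD1_perc'], and it selects the Agg backend only so that it runs without a display; neither is part of the original question. Whether the unfiltered call actually raises depends on the installed matplotlib and NumPy versions, so the sketch only counts the NaN values and plots the cleaned data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfj2_MARKET1['VSPD1_perc']
s = pd.Series([1, 2, 3, 2, 2, 3, 5, 2, 3, 2, np.nan])

# Check how many NaN values are present before plotting
n_missing = int(s.isna().sum())
clean = s.dropna()

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# Option 1: drop the NaN values, then call matplotlib's hist
counts, bin_edges, _ = axes[0].hist(clean, alpha=0.9, color="blue")
axes[0].grid(True)

# Option 2: call the pandas hist method and hand it the existing axes
s.hist(ax=axes[1], alpha=0.9, color="blue")

print("NaN values:", n_missing, "| values plotted:", int(counts.sum()))
```

If n_missing is greater than zero for your own column, NaN values are a likely cause of the error.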
joris answered Oct 02 '22 10:10


The error is indeed due to NaN values, as explained above. If the column is not numeric, first convert it:

df['column_name'] = pd.to_numeric(df['column_name'])

and then replace the NaN values with a value of your choice:

df['column_name'] = df['column_name'].replace(np.nan, your_value)
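
As a rough sketch of that conversion, assuming a made-up DataFrame with a string column named column_name and a fill value of 0.0 (both hypothetical): passing errors='coerce' to pd.to_numeric turns entries that cannot be parsed into NaN instead of raising, and fillna then replaces them. Keep in mind that filling adds artificial observations to the histogram, so dropping the NaN values may be the better choice for plotting.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, plus unparseable and missing entries
df = pd.DataFrame({"column_name": ["1.5", "2", "n/a", None, "4"]})

# Convert to numeric; entries that cannot be parsed become NaN
df["column_name"] = pd.to_numeric(df["column_name"], errors="coerce")
n_missing = int(df["column_name"].isna().sum())

# Replace the NaN values with a chosen value (0.0 is an arbitrary assumption)
fill_value = 0.0
df["column_name"] = df["column_name"].fillna(fill_value)

print(n_missing)
print(df["column_name"].tolist())
```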
brainhack answered Oct 02 '22 09:10