I'm trying to plot a histogram of a column in a pandas DataFrame ('df_plot'). Since I want the y-axis to show percentages rather than counts, I use the weights option to achieve this. As the output below shows, the weights array and the data series have the same shape. Why do I still get an error telling me that the weights and x are not the same shape?
Code:
w = 100*(np.zeros_like(df_plot[var]) + 1. / len(df_plot[var]))
print(w.shape)
print(df_plot[var].shape)
df_plot[var].hist(bins=100, cumulative=True, weights=w)
Output and stacktrace:
(9066,)
(9066,)

Traceback (most recent call last):
File "<ipython-input-59-5612307b159e>", line 4, in <module>
df_plot[var].hist(bins=100, cumulative=True, weights=w)
File "C:\Anaconda\lib\site-packages\pandas\tools\plotting.py", line 2819, in hist_series
ax.hist(values, bins=bins, **kwds)
File "C:\Anaconda\lib\site-packages\matplotlib\axes\_axes.py", line 5649, in hist
'weights should have the same shape as x')
ValueError: weights should have the same shape as x
We can normalize a histogram in Matplotlib by setting the density keyword argument to True. In a normalized histogram, the total area of the bars equals 1.
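A quick numerical check of the area claim, sketched with np.histogram, which accepts a density argument with the same meaning. The sample data here is made up for illustration.

```python
import numpy as np

# Made-up sample data (an assumption, not from the question).
rng = np.random.RandomState(0)
data = rng.normal(size=1000)

# density=True rescales bar heights so that the bars integrate to 1.
heights, edges = np.histogram(data, bins=20, density=True)

# Area of each bar is height * width; the areas should sum to 1.
widths = np.diff(edges)
total_area = np.sum(heights * widths)
```

Note that the heights themselves do not sum to 1 unless every bin has width 1, which is why density is not a substitute for percentage weights.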
histtype: an optional parameter that sets the type of histogram to draw. It is one of 'bar', 'barstacked', 'step' or 'stepfilled'.
Space between the bars can be added with the rwidth parameter of plt.hist(). It specifies the width of each bar as a fraction of the bin width, so it takes a value between 0 and 1.
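You have nulls in your data set. The hist method drops them before passing the values to matplotlib, so the data actually plotted is shorter than w, even though w has the same shape as the original column. Build the weights from the series after dropping the nulls: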
s = df_plot[var].dropna()
w = 100*(np.zeros_like(s) + 1. / len(s))
s.hist(bins=100, cumulative=True, weights=w)
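Here is a self-contained sketch of the same fix on made-up data. The column name 'x' and its values are assumptions, not taken from the question. It uses np.histogram instead of the plotting call so that the result can be checked numerically; the same w can be passed to s.hist as above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one column containing a few missing values.
df_plot = pd.DataFrame({"x": [1.0, 2.0, np.nan, 4.0, 5.0, np.nan, 7.0, 8.0]})
var = "x"

# Drop the nulls first so that data and weights have the same length.
s = df_plot[var].dropna()

# Each remaining observation contributes 100 / len(s) percent.
w = np.full(len(s), 100.0 / len(s))

# Bin the data with the weights, returning numbers instead of a plot.
pct, edges = np.histogram(s, bins=4, weights=w)

# Running total of the percentages, analogous to cumulative=True.
cum_pct = np.cumsum(pct)
```

Because the weights are built from s rather than from df_plot[var], the last cumulative value comes out at 100 percent regardless of how many nulls the original column had.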