Histogram has only one bar

My data--a 196,585-record numpy array extracted from a pandas dataframe--are being placed into a single bin by matplotlib's hist function. The data were originally integers, so I tried converting them to float as well, as shown below, but they are still not being distributed among 10 bins.

Interestingly, a small sub-sample (using df.sample(0.00x)) of the integer data are successfully distributed.

Any suggestions on where I may be erring in data preparation or use of matplotlib's histogram function would be appreciated.

histogram output

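import matplotlib.pyplot as plt

# OPP_VALUE entries for unit 'X', as a numpy array
x = df[df['UNIT'] == 'X'].OPP_VALUE.values
num_bins = 10
# `density` replaces the older `normed` argument, which newer Matplotlib releases no longer accept
n, bins, patches = plt.hist(x[x > 0].astype(float), num_bins, density=False, facecolor='0.5', alpha=0.8)
plt.show()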
A. Slowey asked Aug 02 '16 17:08



1 Answer

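Most likely, very few data points have x > 0.5, but a handful of outliers force the hist function to pick the scale it does. By default, hist spreads equal-width bins across the full range of the data, from minimum to maximum, so a few extreme values stretch the range and almost everything lands in the first bin. That would also explain why a small random sub-sample plots fine: it is likely to miss the outliers. Try removing all values > 0.5 (or 1 if you do not want to convert to float) and then plot again.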
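To illustrate the outlier effect, here is a minimal sketch using numpy.histogram, which bins data the same way plt.hist does. The data are synthetic stand-ins, not the asker's values, and the 99th-percentile cutoff is an arbitrary choice made for illustration rather than a recommended threshold.

```python
import numpy as np

# Synthetic stand-in for the real data (an assumption): many small
# positive values plus three extreme outliers.
rng = np.random.default_rng(0)
x = np.concatenate([
    rng.integers(1, 1000, size=100_000).astype(float),
    np.array([1e9, 2e9, 5e9]),
])

# Ten equal-width bins over [min, max]. The outliers stretch the range,
# so all of the small values fall into the first bin.
counts_all, edges_all = np.histogram(x, bins=10)

# Restrict the binned range to [min, 99th percentile]. Values outside
# the range are ignored. The cutoff is illustrative only.
upper = np.percentile(x, 99)
counts_trim, edges_trim = np.histogram(x, bins=10, range=(x.min(), upper))

print("full range:   ", counts_all)
print("trimmed range:", counts_trim)
```

plt.hist accepts the same bins and range arguments, so passing range=(x.min(), upper) to the original call is one way to keep the outliers from dictating the scale without filtering the array first.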

Lakshmi Prakash answered Oct 13 '22 16:10
