I am getting the following error while trying to plot a pandas DataFrame:
ValueError: num must be 1 <= num <= 0, not 1
Code:
import pandas as pd
import matplotlib.pyplot as plt
names = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety']
custom = pd.DataFrame(x_train)  # only a portion of the csv
custom.columns = names
custom.hist()
plt.show()
I have tried reading the file again from the csv, and I get exactly the same error.
Edit:
print x_train
output:
[[0.0 0.0 0.0 0.0 0.0 0.0]
[1.0 1.0 0.0 0.0 0.0 0.0]
[0.0 0.0 0.0 0.0 0.0 0.0]
...,
[0.0 0.0 0.0 0.0 0.0 0.0]
[0.3333333333333333 0.3333333333333333 2.0 2.0 2.0 2.0]
[0.0 0.0 3.0 3.0 3.0 3.0]]
Edit2:
Complete list of errors(Traceback):
Traceback (most recent call last):
  File "temp.py", line 104, in <module>
    custom.dropna().hist()
  File "/home/kostas/anaconda2/lib/python2.7/site-packages/pandas/tools/plotting.py", line 2893, in hist_frame
    layout=layout)
  File "/home/kostas/anaconda2/lib/python2.7/site-packages/pandas/tools/plotting.py", line 3380, in _subplots
    ax0 = fig.add_subplot(nrows, ncols, 1, **subplot_kw)
  File "/home/kostas/anaconda2/lib/python2.7/site-packages/matplotlib/figure.py", line 1005, in add_subplot
    a = subplot_class_factory(projection_class)(self, *args, **kwargs)
  File "/home/kostas/anaconda2/lib/python2.7/site-packages/matplotlib/axes/_subplots.py", line 64, in __init__
    maxn=rows*cols, num=num))
I had the same problem, and I found that it happened because the NumPy array was an object array rather than a float array.
Try this:
x_train = x_train.astype(float)
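A minimal sketch of why the dtype matters, assuming x_train is a NumPy array stored with dtype=object (the sample rows below are made up for illustration). A DataFrame built from an object array keeps object columns, so selecting numeric columns, which is roughly what the histogram code does before laying out subplots, leaves nothing to plot. The exact error raised in that case may differ between pandas versions.

```python
import numpy as np
import pandas as pd

names = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety']

# Made-up sample rows stored as dtype=object; this is an assumption
# about what x_train looks like in the question.
x_train = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 3.0, 3.0, 3.0, 3.0]], dtype=object)

# Built from the object array: every column has dtype object.
object_frame = pd.DataFrame(x_train, columns=names)
numeric_before = object_frame.select_dtypes(include=[np.number])

# Built after the cast: every column has dtype float64.
float_frame = pd.DataFrame(x_train.astype(float), columns=names)
numeric_after = float_frame.select_dtypes(include=[np.number])

print(numeric_before.shape[1])  # 0 numeric columns
print(numeric_after.shape[1])   # 6 numeric columns
```

With zero numeric columns there are zero subplots to lay out, which matches the "<= 0" in the error message.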
I'm pretty sure your issue has something to do with the format of the array x_train. I tried your program with an array of 10,000 rows and 6 columns and it worked fine, so the issue is not size. For some reason, one of len(x_train) or len(x_train[0]) is 0. Here is why I think so:
The ValueError you are getting comes from the matplotlib.axes._subplots module, which deals with drawing many small subplots within one big plot (here, one small histogram per column). The relevant code of the module is this:
"""
*rows*, *cols*, *num* are arguments where
the array of subplots in the figure has dimensions *rows*,
*cols*, and where *num* is the number of the subplot
being created. *num* starts at 1 in the upper left
corner and increases to the right.
"""
rows, cols, num = args
rows = int(rows)
cols = int(cols)
if isinstance(num, tuple) and len(num) == 2:
    num = [int(n) for n in num]
    self._subplotspec = GridSpec(rows, cols)[num[0] - 1:num[1]]
else:
    if num < 1 or num > rows*cols:
        raise ValueError(
            "num must be 1 <= num <= {maxn}, not {num}".format(
                maxn=rows*cols, num=num))
Your issue is in this part (see the explanation in the code comments):
if num < 1 or num > rows*cols:
    # maxn is rows*cols, and your error message shows it as 0.
    # That means the subplot grid built for your histogram has
    # 0 rows or 0 columns. I'm not sure why, though.
    raise ValueError(
        "num must be 1 <= num <= {maxn}, not {num}".format(
            maxn=rows*cols, num=num))
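The check quoted above can be reproduced on its own, without matplotlib. This is only a sketch: check_subplot_number is a hypothetical stand-in that copies the quoted condition, not a function from the library.

```python
def check_subplot_number(rows, cols, num):
    """Hypothetical stand-in that mirrors the quoted range check."""
    if num < 1 or num > rows * cols:
        raise ValueError(
            "num must be 1 <= num <= {maxn}, not {num}".format(
                maxn=rows * cols, num=num))
    return num


# An empty grid (0 rows, 0 columns) rejects even the first subplot.
try:
    check_subplot_number(0, 0, 1)
    message = None
except ValueError as err:
    message = str(err)

print(message)  # num must be 1 <= num <= 0, not 1
```

A grid with at least one cell, for example check_subplot_number(2, 3, 1), passes the same check.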
I don't know how you are reading your input format, but I'm pretty sure the problem is related to it. If you set x_train to this it works fine:
x_train = [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
           [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
           [0.3333333333333333, 0.3333333333333333, 2.0, 2.0, 2.0, 2.0],
           [0.0, 0.0, 3.0, 3.0, 3.0, 3.0]]
Try doing this before running the code in your question and see if that works:
x_train = [list(x) for x in x_train]
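One possible reason the list conversion helps, sketched under the assumption that x_train is an object-dtype NumPy array holding plain Python floats (the values below are made up): pandas keeps the object dtype when given the array directly, but infers float64 when given nested lists.

```python
import numpy as np
import pandas as pd

# Assumed shape of the problem: float values stored in an object array.
x_train = np.array([[0.0, 0.0], [1.0, 2.0]], dtype=object)

direct_frame = pd.DataFrame(x_train)      # keeps dtype object
as_lists = [list(row) for row in x_train]
list_frame = pd.DataFrame(as_lists)       # dtype inferred from the values

print(direct_frame.dtypes.tolist())
print(list_frame.dtypes.tolist())
```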