I'm just trying to do a simple linear regression, but I'm baffled by the error from this code:
regr = LinearRegression()
regr.fit(df2.iloc[1:1000, 5].values, df2.iloc[1:1000, 2].values)
which produces:
ValueError: Found arrays with inconsistent numbers of samples: [ 1 999]
These selections must have the same dimensions, and they should be numpy arrays, so what am I missing?
It looks like sklearn requires the feature data to have the shape (number of rows, number of columns).
If your data has the shape (number of rows,), such as (999,), it does not work.
With numpy.reshape() you can change the shape of the array to (999, 1), e.g.:
data = data.reshape((999, 1))
Only the first argument to fit() (the features) needs this; the target can stay one-dimensional.
In my case, that fixed it.
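To make the reshape fix concrete, here is a minimal sketch. It uses made-up data in place of the question's df2 (the arrays x and y and the relation y = 3x + 2 are assumptions for illustration only), and it uses reshape(-1, 1) so the row count does not have to be hard-coded.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data (not from the question): y = 3x + 2 exactly
x = np.arange(999, dtype=float)   # shape (999,), the problematic 1-D feature array
y = 3.0 * x + 2.0                 # shape (999,), a 1-D target is fine

# -1 lets numpy infer the number of rows; the result is one column of features
X = x.reshape(-1, 1)              # shape (999, 1)

regr = LinearRegression()
regr.fit(X, y)

print(X.shape)                    # shape of the feature matrix after reshaping
print(regr.coef_, regr.intercept_)
```

Applied to the question's code, this would presumably mean reshaping only the first argument, e.g. df2.iloc[1:1000, 5].values.reshape(-1, 1).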
It looks like you are using a pandas DataFrame (judging by the name df2).
In that case you could also do the following:
regr = LinearRegression()
regr.fit(df2.iloc[1:1000, 5].to_frame(), df2.iloc[1:1000, 2].to_frame())
NOTE: I have removed ".values" because it converts the pandas Series to a numpy.ndarray, and numpy.ndarray has no to_frame() method.
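A related variant, sketched here as an assumption rather than as part of the answer above: passing a list of column positions to iloc returns a DataFrame directly, so to_frame() is not needed. The df2 below is a hypothetical six-column stand-in with random data, built only so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for df2: 1000 rows, six columns of random numbers
rng = np.random.default_rng(0)
df2 = pd.DataFrame(rng.normal(size=(1000, 6)), columns=list("abcdef"))
# Make column 2 ("c") an exact linear function of column 5 ("f")
df2["c"] = 2.0 * df2["f"] + 1.0

# A list of positions ([5]) keeps the result two-dimensional
X = df2.iloc[1:1000, [5]]   # DataFrame, shape (999, 1)
y = df2.iloc[1:1000, 2]     # Series, shape (999,)

regr = LinearRegression()
regr.fit(X, y)

print(X.shape, y.shape)
print(regr.coef_, regr.intercept_)
```

Keeping the target as a Series gives a one-dimensional coef_; with to_frame() on the target, as in the answer above, coef_ comes back two-dimensional instead.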