sklearn: Found arrays with inconsistent numbers of samples when calling LinearRegression.fit()

Question

I'm just trying to do a simple linear regression, but I'm baffled by the error raised by this code:

regr = LinearRegression()
regr.fit(df2.iloc[1:1000, 5].values, df2.iloc[1:1000, 2].values)

which produces:

ValueError: Found arrays with inconsistent numbers of samples: [  1 999]

These selections must have the same dimensions, and they should be numpy arrays, so what am I missing?

Yul · Accepted Answer

It looks like sklearn requires the feature data to have the shape (number of rows, number of columns). If your data has the shape (number of rows,), e.g. (999,), it does not work: as the error message shows, the array is read as 1 sample, which does not match the 999 target values. Use numpy.reshape() to change the shape of the array to (999, 1), e.g.:

data = data.reshape((999, 1))

That worked in my case.
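
A minimal sketch of the reshape fix, in case it helps. The synthetic data and the names x, y and X are assumptions for illustration, not the asker's df2; reshape(-1, 1) is used so that numpy infers the row count instead of hard-coding 999.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data for the two selected columns (not the asker's df2).
rng = np.random.default_rng(0)
x = rng.normal(size=999)   # 1-D feature values, shape (999,)
y = 3.0 * x + 1.0          # 1-D targets with an exact linear relation, shape (999,)

# Turn the 1-D feature array into a single-column 2-D array.
# -1 lets numpy infer the number of rows, so this gives shape (999, 1).
X = x.reshape(-1, 1)

regr = LinearRegression()
regr.fit(X, y)             # X is 2-D (samples, features); y may stay 1-D

slope = regr.coef_[0]      # one coefficient, because X has one column
intercept = regr.intercept_
print(X.shape, slope, intercept)
```

Only the feature array needs the extra dimension here; the target can stay one-dimensional.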

user24981 · Answer

It looks like you are using a pandas DataFrame (judging by the name df2).

You could also do the following:

regr = LinearRegression()
regr.fit(df2.iloc[1:1000, 5].to_frame(), df2.iloc[1:1000, 2].to_frame())

NOTE: I have removed ".values" because it converts the pandas Series to a numpy.ndarray, and numpy.ndarray has no to_frame() method.
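
To see why to_frame() helps, here is a small sketch comparing the shapes involved. The toy df2 below, with its column names and sizes, is an assumption for illustration only. Selecting with a list of column positions is shown as another way to keep the result two-dimensional.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2: 10 rows, 3 integer columns.
df2 = pd.DataFrame(np.arange(30).reshape(10, 3), columns=["a", "b", "c"])

as_series = df2.iloc[1:5, 2]        # a single column position returns a Series (1-D)
as_array = as_series.values         # 1-D numpy array, shape (4,)
as_frame = as_series.to_frame()     # one-column DataFrame, shape (4, 1)
as_frame_alt = df2.iloc[1:5, [2]]   # a list of positions keeps the result 2-D

print(as_array.shape, as_frame.shape, as_frame_alt.shape)
```

Either of the two-dimensional forms has the (rows, columns) layout that fit() expects for the features.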

Tags:

scikit-learn

sunny
