I have a simple CSV file with two columns, ErrorWeek and ErrorCount.
I read the CSV data into a pandas dataframe, like this:
df = pd.read_csv("Errors.csv", sep=",")
df.head() shows:
   ErrorWeek  ErrorCount
0          1          80
1          2         118
2          3         249
3          4         397
4          5         159
So far so good.
Then, I create a test/train split, like this:
X_train, X_test, y_train, y_test = train_test_split(
df['ErrorWeek'], df['ErrorCount'], random_state=0)
No errors so far.
But, I then create a linear regression object and try to fit the data.
# Create linear regression object
regr = linear_model.LinearRegression()
# Train the model using the training sets
regr.fit(X_train, y_train)
Here I do get an error: "Reshape your data either using array.reshape(-1, 1)"
--
Looking at the shape of X_train and y_train, I get what looks like two one-dimensional "arrays":
X_train shape: (36,)
y_train shape: (36,)
--
I have spent many hours trying to figure this out, but I'm new to Pandas, Python, and to scikit-learn.
I'm reading in two-dimensional data, but pandas isn't seeing it that way.
What do I need to do, specifically?
Thanks,
Doing:
X_train, X_test, y_train, y_test = train_test_split(
df['ErrorWeek'], df['ErrorCount'], random_state=0)
will make all output arrays one-dimensional, because you are selecting a single column for both X and y.
Now, when you pass a one-dimensional array of shape [n,], scikit-learn cannot tell whether you passed one sample with n features or n samples with a single feature; i.e., from the X data alone it cannot infer whether n_samples=n and n_features=1 or the other way around (n_samples=1 and n_features=n).
Hence it asks you to reshape the 1-D data you provided into 2-D data of shape [n_samples, n_features].
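To make the shape difference concrete, here is a minimal sketch (using a small illustrative NumPy array, not your actual data):

import numpy as np

# A 1-D array of shape (5,) -- sklearn cannot tell whether this is
# 5 samples with 1 feature or 1 sample with 5 features.
x = np.array([1, 2, 3, 4, 5])
print(x.shape)         # (5,)

# Reshaping to (-1, 1) makes the intent explicit:
# 5 rows (samples), 1 column (feature).
x_2d = x.reshape(-1, 1)
print(x_2d.shape)      # (5, 1)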
Now there are multiple ways of doing this.
You can do what scikit-learn suggests (going through .values here, because a pandas Series does not have a reshape method in recent versions):

X_train = X_train.values.reshape(-1, 1)
X_test = X_test.values.reshape(-1, 1)
The 1 in the second position of reshape says there is a single column (one feature), and the -1 tells NumPy to infer the number of rows automatically for that single column.
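For context, here is a minimal end-to-end sketch of the whole flow; the file name and column names are taken from your question, everything else is standard pandas/scikit-learn usage:

import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split

df = pd.read_csv("Errors.csv", sep=",")

X_train, X_test, y_train, y_test = train_test_split(
    df['ErrorWeek'], df['ErrorCount'], random_state=0)

# Reshape the 1-D Series into 2-D arrays of shape (n_samples, 1)
X_train = X_train.values.reshape(-1, 1)
X_test = X_test.values.reshape(-1, 1)

regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)        # no reshape error now
print(regr.predict(X_test)[:5])   # first few predictions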
Or you can add the extra dimension in the fit call itself (again going through .values, because slicing a Series with [:, None] is no longer supported in recent pandas):

regr.fit(X_train.values[:, None], y_train)
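Another option (not in your original code, just an alternative worth knowing) is to keep X two-dimensional from the start by selecting the column with double brackets, so that train_test_split already hands back a one-column DataFrame and no reshape is needed:

# df[['ErrorWeek']] is a one-column DataFrame (2-D),
# while df['ErrorWeek'] is a Series (1-D).
X_train, X_test, y_train, y_test = train_test_split(
    df[['ErrorWeek']], df['ErrorCount'], random_state=0)

regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)   # X_train already has shape (n_samples, 1)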