On a fresh installation of Anaconda under Ubuntu... I am preprocessing my data in various ways prior to a classification task using Scikit-Learn.
    from sklearn import preprocessing

    scaler = preprocessing.MinMaxScaler().fit(train)
    train = scaler.transform(train)
    test = scaler.transform(test)

This all works fine, but I run into a problem when I have a new sample (temp below) that I want to classify, and therefore want to preprocess in the same way:

    temp = [1,2,3,4,5,5,6,....................,7]
    temp = scaler.transform(temp)
Then I get a deprecation warning...
DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
So the question is how should I be rescaling a single sample like this?
I suppose an alternative (not a very good one) would be...

    temp = [temp, temp]
    temp = scaler.transform(temp)
    temp = temp[0]
But I'm sure there are better ways.
In NumPy, -1 in reshape(-1) refers to an unknown dimension that the reshape() function calculates for you. It is like saying: “I will leave this dimension for the reshape() function to determine”. A common use case is to flatten a nested array of an unknown number of elements to a 1D array.
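To make the role of -1 concrete, here is a small sketch using made-up arrays (the names a, row, col, nested and flat are only for illustration). It shows NumPy inferring the missing dimension in each case:

```python
import numpy as np

a = np.arange(6)             # 1D array, shape (6,)

# One row, let NumPy work out the number of columns
row = a.reshape(1, -1)       # shape (1, 6)

# One column, let NumPy work out the number of rows
col = a.reshape(-1, 1)       # shape (6, 1)

# Flatten a nested (2D) array into 1D
nested = np.array([[1, 2, 3], [4, 5, 6]])
flat = nested.reshape(-1)    # shape (6,)

print(row.shape, col.shape, flat.shape)
```

Only one dimension can be given as -1 in a single reshape call, since NumPy has to be able to compute it from the total number of elements.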
Reshape your data using either array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample. If the data is held in a pandas Series, you could convert it to a NumPy array first and then reshape it to have two dimensions.
Just listen to what the warning is telling you:
Reshape your data using either X.reshape(-1, 1) if your data has a single feature/column or X.reshape(1, -1) if it contains a single sample.
For your example, a single sample with more than one feature/column (temp is a list there, so convert it to a NumPy array first):

    temp = np.array(temp).reshape(1, -1)

For one feature/column:

    temp = np.array(temp).reshape(-1, 1)
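Putting this together for the case in the question, here is a minimal sketch. The training data and the four-feature sample are invented for illustration and are not from the original post; only MinMaxScaler and reshape(1, -1) come from the thread.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: 3 samples, 4 features
train = np.array([[0.0, 10.0, 100.0, 1.0],
                  [5.0, 20.0, 200.0, 2.0],
                  [10.0, 30.0, 300.0, 3.0]])
scaler = MinMaxScaler().fit(train)

# A single new sample given as a plain list (1D)
temp = [5.0, 15.0, 250.0, 1.5]

# Turn it into one row with 4 columns, i.e. shape (1, 4)
temp_2d = np.array(temp).reshape(1, -1)

# transform returns a 2D array; take row 0 to get the 1D sample back
scaled = scaler.transform(temp_2d)[0]
print(scaled)
```

The sample has to have the same number of features as the data the scaler was fitted on, which is why reshape(1, -1) rather than reshape(-1, 1) is the right choice here.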
Well, it actually looks like the warning is telling you what to do.
As part of the uniform interface of sklearn.pipeline stages, as a rule of thumb: when you see X, it should be an np.array with two dimensions; when you see y, it should be an np.array with a single dimension.
Here, therefore, you should consider the following:
    temp = [1,2,3,4,5,5,6,....................,7]
    # This makes it into a 2d array: one sample (row) with len(temp) features
    temp = np.array(temp).reshape((1, len(temp)))
    temp = scaler.transform(temp)
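The X/y rule of thumb can be seen end to end in a small pipeline. This is only a sketch under assumptions: the toy data, the choice of KNeighborsClassifier and the variable names are mine, not part of the question.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# X is 2D: 4 samples, 2 features. y is 1D: one label per sample.
X = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [9.0, 9.0],
              [10.0, 10.0]])
y = np.array([0, 0, 1, 1])

# The pipeline scales the data and then classifies it
model = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=1))
model.fit(X, y)

# A single new sample still has to be passed as a 2D array of shape (1, 2)
new_sample = np.array([8.5, 9.5]).reshape(1, -1)
pred = model.predict(new_sample)   # 1D array with one predicted label
print(pred)
```

Wrapping the scaler and the classifier in one pipeline also means a new sample is scaled with the parameters learned from the training data, without calling scaler.transform by hand.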