
Preprocessing in scikit-learn - single sample - Deprecation warning

On a fresh installation of Anaconda under Ubuntu... I am preprocessing my data in various ways prior to a classification task using Scikit-Learn.

    from sklearn import preprocessing
    scaler = preprocessing.MinMaxScaler().fit(train)
    train = scaler.transform(train)
    test = scaler.transform(test)

This all works fine, but when I have a new sample (temp below) that I want to classify, and therefore need to preprocess in the same way, I run into a problem:

    temp = [1,2,3,4,5,5,6,....................,7]
    temp = scaler.transform(temp)

Then I get a deprecation warning...

DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.

So the question is how should I be rescaling a single sample like this?

I suppose an alternative (not a very good one) would be...

    temp = [temp, temp]
    temp = scaler.transform(temp)
    temp = temp[0]

But I'm sure there are better ways.

Chris Arthur asked Jan 29 '16 10:01




2 Answers

Just listen to what the warning is telling you:

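Reshape your data either using X.reshape(-1, 1) if your data has a single feature/column, or X.reshape(1, -1) if it contains a single sample.

For your example (a single sample with more than one feature/column), convert the list to a NumPy array first, since a plain Python list has no reshape method:

    import numpy as np
    temp = np.array(temp).reshape(1, -1)

For one feature/column:

    temp = np.array(temp).reshape(-1, 1)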
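To make the difference between the two reshapes concrete, here is a minimal sketch. The five-value array is made up for illustration and only stands in for the elided list in the question.

```python
import numpy as np

# Hypothetical 5-value sample; the real data in the question is elided.
temp = np.array([1, 2, 3, 4, 5])

# One row, five columns: a single sample with five features.
one_sample = temp.reshape(1, -1)

# Five rows, one column: five samples with a single feature each.
one_feature = temp.reshape(-1, 1)

print(temp.shape)         # (5,)
print(one_sample.shape)   # (1, 5)
print(one_feature.shape)  # (5, 1)
```

In both calls, -1 asks NumPy to infer that dimension from the number of elements, so the same line works whatever the length of the sample.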
Mike answered Oct 03 '22 03:10


Well, it actually looks like the warning is telling you what to do.

scikit-learn's estimators and pipeline stages share a uniform interface, so as a rule of thumb:

  • when you see X, it should be an np.array with two dimensions

  • when you see y, it should be an np.array with a single dimension.

Here, therefore, since temp is a single sample, you should consider the following:

    import numpy as np
    temp = [1,2,3,4,5,5,6,....................,7]
    # This makes it into a 2d array with one row (one sample, len(temp) features)
    temp = np.array(temp).reshape(1, -1)
    temp = scaler.transform(temp)
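For a single sample with several features, the whole round trip might look like the sketch below. It assumes a MinMaxScaler fitted on a training matrix with the same number of features as the new sample; the arrays train and new_sample are invented for illustration and are not the asker's data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: 3 samples, 4 features each.
train = np.array([[0.0, 10.0, 100.0, 1.0],
                  [5.0, 20.0, 200.0, 2.0],
                  [10.0, 30.0, 300.0, 3.0]])
scaler = MinMaxScaler().fit(train)

# A single new sample as a plain 1-D list, as in the question.
new_sample = [5.0, 10.0, 300.0, 2.0]

# Wrap it as a 2-D array with one row, transform it, then take row 0
# to get back a 1-D array of scaled features.
scaled = scaler.transform(np.array(new_sample).reshape(1, -1))[0]

print(scaled)  # approximately [0.5, 0.0, 1.0, 0.5]
```

This avoids the duplicate-the-sample workaround from the question: the scaler sees a proper one-row matrix, and indexing with [0] recovers the flat feature vector.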
Ami Tavory answered Oct 03 '22 04:10