Python ValueError: broadcast error with scikit-learn preprocessing

I am trying to run the StandardScaler from scikit-learn's preprocessing module, and I receive the following error:

from sklearn import preprocessing as pre
scaler = pre.StandardScaler().fit(t_train)
t_train_scale = scaler.transform(t_train)
t_test_scale = scaler.transform(t_test)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-149-c0133b7e399b> in <module>()
      4 scaler = pre.StandardScaler().fit(t_train)
      5 t_train_scale = scaler.transform(t_train)

----> 6 t_test_scale = scaler.transform(t_test)

C:\Users\****\Anaconda\lib\site-packages\sklearn\preprocessing\data.pyc in transform(self, X, y, copy)
    356         else:
    357             if self.with_mean:
--> 358                 X -= self.mean_
    359             if self.with_std:
    360                 X /= self.std_

ValueError: operands could not be broadcast together with shapes (40000,59) (119,) (40000,59) 

I understand that the shapes do not match. The train and test data sets are different lengths, so how should I transform the test data?

asked Sep 27 '22 08:09 by user2977664



1 Answer

Please print the values of t_train.shape[1] and t_test.shape[1].

StandardScaler expects the data passed to transform to have the same number of columns as the data it was fitted on; the number of rows can differ. I suspect earlier preprocessing (dropping columns, adding dummy columns, etc.) is the source of your problem. Whatever transformations you apply to t_train also need to be applied to t_test.
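One common way this mismatch arises is dummy-encoding the two sets separately. The following is only a minimal sketch under assumptions: the data is held in pandas DataFrames, and the column names ("age", "city") and values are hypothetical, not taken from the question. It shows the test columns being aligned to the training columns before scaling.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data: one numeric and one categorical column.
raw_train = pd.DataFrame({"age": [25, 32, 47, 51], "city": ["A", "B", "C", "A"]})
raw_test = pd.DataFrame({"age": [29, 40], "city": ["A", "B"]})  # category "C" never appears

# Encoding each set on its own yields different column counts,
# because the test set has no "city_C" dummy column.
t_train = pd.get_dummies(raw_train, columns=["city"], dtype=float)
t_test = pd.get_dummies(raw_test, columns=["city"], dtype=float)
print(t_train.shape[1], t_test.shape[1])

# Give the test set exactly the training columns, in the same order.
# Dummy columns missing from the test set are filled with 0.
t_test = t_test.reindex(columns=t_train.columns, fill_value=0.0)

# Fit on the training data only, then reuse the same scaler for both sets.
scaler = StandardScaler().fit(t_train)
t_train_scale = scaler.transform(t_train)
t_test_scale = scaler.transform(t_test)
print(t_train_scale.shape, t_test_scale.shape)
```

Reindexing against t_train.columns also drops any column that exists only in the test set, which may or may not be what you want for your data.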

The error message already contains the information I'm asking for:

ValueError: operands could not be broadcast together with shapes (40000,59) (119,) (40000,59)

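In the traceback, X (the array being transformed, here t_test) has shape (40000, 59), while self.mean_ (computed from t_train during fit) has shape (119,). So I expect you'll find that t_train.shape[1] is 119 and t_test.shape[1] is 59: you have 119 columns in your training dataset and 59 in your test dataset.

Did you remove any columns from the test set, or add columns (for example dummy columns) to the training set only, before attempting to use StandardScaler?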
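To see how the shapes in the message map onto the failing line X -= self.mean_, here is a small NumPy-only sketch. The arrays are zero-filled stand-ins with a reduced row count, not the asker's data; the names X and mean_ simply mirror the traceback.

```python
import numpy as np

# Stand-in for the array passed to transform: 59 columns.
X = np.zeros((5, 59))

# Stand-in for the per-column means stored at fit time: one entry per column
# of the data the scaler was fitted on, here 119.
mean_ = np.zeros(119)

# Compare the column count of X with the length of the stored means.
print(X.shape[1], mean_.shape[0])

# The same in-place subtraction that fails inside transform.
try:
    X -= mean_
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The shapes listed in the message are, in order, the array being transformed, the stored means, and the output array of the in-place operation.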

answered Oct 19 '22 12:10 by Jason Wolosonovich
