scikit-learn: how to scale back the 'y' predicted result

I'm trying to learn scikit-learn and Machine Learning by using the Boston Housing Data Set.

# I split the initial dataset ('housing_X' and 'housing_y')
# (train_test_split lives in sklearn.model_selection; sklearn.cross_validation was removed in scikit-learn 0.20)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(housing_X, housing_y, test_size=0.25, random_state=33)

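# I scaled those two datasets
# (StandardScaler expects 2-D input, so the 1-D target is reshaped to a column;
# this assumes y_train and y_test are NumPy arrays)
from sklearn.preprocessing import StandardScaler
scalerX = StandardScaler().fit(X_train)
scalery = StandardScaler().fit(y_train.reshape(-1, 1))
X_train = scalerX.transform(X_train)
y_train = scalery.transform(y_train.reshape(-1, 1)).ravel()
X_test = scalerX.transform(X_test)
y_test = scalery.transform(y_test.reshape(-1, 1)).ravel()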

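# I created the model
from sklearn import linear_model
# 'squared_error' is the current name of the loss formerly called 'squared_loss'
clf_sgd = linear_model.SGDRegressor(loss='squared_error', penalty=None, random_state=42)
# train_and_evaluate is a helper that is not shown in this question
train_and_evaluate(clf_sgd, X_train, y_train)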

Using this new model clf_sgd, I am trying to predict y for the first instance of X_train.

X_new_scaled = X_train[0]
print(X_new_scaled)
# predict expects a 2-D array of shape (n_samples, n_features)
y_new = clf_sgd.predict(X_new_scaled.reshape(1, -1))
print(y_new)

However, the result looks odd to me: 1.34032174 instead of something in the 20-30 range of the house prices.

[-0.32076092  0.35553428 -1.00966618 -0.28784917  0.87716097  1.28834383
  0.4759489  -0.83034371 -0.47659648 -0.81061061 -2.49222645  0.35062335
 -0.39859013]
[ 1.34032174]

I guess that this 1.34032174 value should be scaled back, but I have not managed to figure out how to do it. Any tips are welcome. Thank you very much.

706

asked Jun 27 '16 16:06

2 Answers

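You can call inverse_transform on your scalery object:

# inverse_transform expects a 2-D array, so the 1-D prediction is reshaped to a column
y_new_inverse = scalery.inverse_transform(y_new.reshape(-1, 1))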
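To see the whole round trip in one place, here is a minimal sketch. It uses synthetic data as a stand-in for the housing set (an assumption, as are the names scaler_X, scaler_y and model, and the default SGDRegressor settings), so the exact numbers will differ from the ones in the question.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the Boston set):
# targets are centred around 25, roughly like the house prices.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 25.0 + 5.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Fit one scaler per array; the 1-D target is reshaped to a column.
scaler_X = StandardScaler().fit(X)
scaler_y = StandardScaler().fit(y.reshape(-1, 1))

X_scaled = scaler_X.transform(X)
y_scaled = scaler_y.transform(y.reshape(-1, 1)).ravel()

# Train on the scaled data, so predictions come out in scaled units.
model = SGDRegressor(random_state=42)
model.fit(X_scaled, y_scaled)

# Predict for the first row, then map the result back to original units.
y_pred_scaled = model.predict(X_scaled[:1])
y_pred = scaler_y.inverse_transform(y_pred_scaled.reshape(-1, 1)).ravel()

print("scaled prediction:  ", y_pred_scaled)
print("original-unit value:", y_pred)
```

For StandardScaler, inverse_transform simply multiplies by the fitted standard deviation (scale_) and adds the fitted mean (mean_), which is why the same scalery object that scaled y_train has to be reused.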

151

answered Oct 26 '22 23:10

Bit late to the game: just don't scale your y. By scaling y you actually lose your units. The regression, or rather the loss optimization, is actually determined by the relative differences between the features. By the way, for house prices (or any other monetary value) it is common practice to take the logarithm. You then need to apply numpy.exp() to the predictions to get back to the actual dollars, euros, yen...
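A rough sketch of the logarithm approach, again on synthetic data: the data, the choice of LinearRegression, and the variable names are assumptions for illustration only. It shows the manual numpy.log / numpy.exp round trip next to scikit-learn's TransformedTargetRegressor, which applies the same pair of functions for you.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Synthetic, strictly positive "prices" (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.exp(3.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(scale=0.05, size=200))

# Manual version: fit on log(y), then exponentiate the predictions.
model = LinearRegression().fit(X, np.log(y))
pred_manual = np.exp(model.predict(X[:5]))

# Wrapper version: the log/exp pair is applied around the regressor.
wrapped = TransformedTargetRegressor(
    regressor=LinearRegression(), func=np.log, inverse_func=np.exp
)
wrapped.fit(X, y)
pred_wrapped = wrapped.predict(X[:5])

print(pred_manual)
print(pred_wrapped)
```

Both versions return predictions in the original monetary units, so there is no separate scale-back step to remember.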

answered Oct 26 '22 23:10

Maartenk

Tags: python, scale, machine-learning, scikit-learn

Hookstark

Ryan
