scikit-learn - Convert pipeline prediction to original value/scale

I've created a pipeline as follows (using the Keras scikit-learn API):

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)

and fitted it with

pipeline.fit(trainX,trainY)

If I predict with pipeline.predict(testX), I believe I get standardised predictions.

How do I predict on testX so that predictedY is on the same scale as the actual (untouched) testY (i.e. NOT a standardised prediction, but the actual values)? I see there is an inverse_transform method for Pipeline, but it appears to be only for reverting a transformed X.

andyandy asked Jan 24 '17 at 19:01


1 Answer

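Exactly. The StandardScaler() in your pipeline only transforms the inputs (trainX) passed to pipeline.fit(trainX, trainY). The targets (trainY) reach the model untouched, so pipeline.predict(testX) returns values in whatever scale trainY had.

So, if you want the targets standardized as well, scale trainY yourself with a separate scaler and invert that scaling on the predictions: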

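import numpy as np
from sklearn.preprocessing import StandardScaler
trainY2d = np.asarray(trainY).reshape(-1, 1)  # StandardScaler expects 2D input (assumes a single target column)
scalerY = StandardScaler().fit(trainY2d)  # fit y scaler
pipeline.fit(trainX, scalerY.transform(trainY2d).ravel())  # fit your pipeline to scaled Y
predictedY = scalerY.inverse_transform(np.asarray(pipeline.predict(testX)).reshape(-1, 1)).ravel()  # predict and rescale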

The inverse_transform() method maps scaled values back to the original scale using the mean and standard deviation computed in StandardScaler().fit(), i.e. y = y_scaled * std + mean.
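To make that arithmetic concrete, here is a minimal plain-Python sketch of what a single-column standardisation and its inverse compute. The helper names (fit_stats, transform, inverse_transform) and the target values are made up for illustration and are not part of scikit-learn; the sketch assumes the population standard deviation (ddof=0), which is what StandardScaler uses.

```python
# Plain-Python sketch of the arithmetic behind standardising one column.
# Helper names and data are hypothetical; this is not scikit-learn code.
from statistics import fmean, pstdev


def fit_stats(values):
    """Return (mean, population std) of the training targets."""
    return fmean(values), pstdev(values)


def transform(values, mean, std):
    """Standardise: z = (y - mean) / std."""
    return [(v - mean) / std for v in values]


def inverse_transform(scaled, mean, std):
    """Undo the standardisation: y = z * std + mean."""
    return [z * std + mean for z in scaled]


train_y = [100.0, 200.0, 300.0, 400.0]  # made-up target values
mean, std = fit_stats(train_y)

scaled = transform(train_y, mean, std)            # zero mean, unit variance
restored = inverse_transform(scaled, mean, std)   # back on the original scale
```

The statistics must come from the training targets only; the same mean and std are then reused to rescale predictions for testX.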

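You can always fit your model without scaling Y, as you mentioned, in which case the predictions are already on the original scale. Depending on your data this can be risky, though: targets with a large range can make a neural network harder to train and hurt the fit. You have to test it ;)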
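Both options can be compared on toy data. The sketch below rests on several assumptions: LinearRegression stands in for the KerasRegressor so the example runs without Keras, the data is synthetic, and the helper build_pipeline is a made-up name. It also uses TransformedTargetRegressor from sklearn.compose (available in scikit-learn 0.20 and later, so newer than the original question), which fits a scaler on y and applies inverse_transform to the predictions automatically, as an alternative to managing scalerY by hand.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with targets far from zero (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5000.0 + 300.0 * X[:, 0] - 120.0 * X[:, 1] + rng.normal(scale=1.0, size=200)

trainX, testX = X[:150], X[150:]
trainY, testY = y[:150], y[150:]


def build_pipeline():
    """Hypothetical helper: scale X, then regress (stand-in for the Keras model)."""
    return Pipeline([
        ("standardize", StandardScaler()),
        ("model", LinearRegression()),
    ])


# Option 1: only X is scaled; predictions are already on trainY's scale.
plain = build_pipeline().fit(trainX, trainY)
plain_pred = plain.predict(testX)

# Option 2: y is standardised for fitting and un-standardised on predict.
wrapped = TransformedTargetRegressor(
    regressor=build_pipeline(),
    transformer=StandardScaler(),
)
wrapped.fit(trainX, trainY)
wrapped_pred = wrapped.predict(testX)  # same scale as testY

print("MAE, y not scaled:", np.abs(plain_pred - testY).mean())
print("MAE, y scaled via wrapper:", np.abs(wrapped_pred - testY).mean())
```

For a linear model the two options give essentially identical predictions, since least squares is unaffected by an affine rescaling of y; with a neural network trained by gradient descent the results can differ, which is why it is worth testing both.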

diogoncalves answered Oct 26 '22 at 18:10