scikit-learn

Question

I've created a pipeline as follows (using the Keras scikit-learn API):

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)

and fit it with

pipeline.fit(trainX,trainY)

If I predict with pipeline.predict(testX), I believe I get standardised predictions.

How do I predict on testX so that predictedY is on the same scale as the actual (untouched) testY (i.e. NOT a standardised prediction, but the actual values)? I see there is an inverse_transform method for Pipeline; however, it appears to be only for reverting a transformed X.

diogoncalves · Accepted Answer

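Exactly. The StandardScaler() in a pipeline only transforms the inputs (trainX) passed to pipeline.fit(trainX, trainY). trainY is left untouched, so pipeline.predict(testX) returns values on whatever scale trainY had when you fitted.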
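A quick way to check this is a minimal sketch with synthetic data. It uses scikit-learn's LinearRegression as a stand-in for the KerasRegressor (an assumption made only to keep the example light), and the data and variable names are hypothetical. The pipeline scales X internally, yet the predictions come out on the original scale of y:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical): y lives on a large scale, far from standardised.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
y = 1000.0 + X @ np.array([3.0, -2.0, 5.0])

# LinearRegression stands in for the KerasRegressor of the question.
pipe = Pipeline([
    ("standardize", StandardScaler()),
    ("model", LinearRegression()),
])
pipe.fit(X, y)

pred = pipe.predict(X)

# The scaler only transformed X; the predictions match y on its original scale.
same_scale = bool(np.allclose(pred, y))
print(same_scale, float(y.mean()), float(pred.mean()))
```

If the predictions were standardised, their mean would be close to 0 rather than close to the mean of y.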

So, if you want the model to be trained on a standardized trainY as well, scale trainY yourself and invert that scaling on the predictions:

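import numpy as np
from sklearn.preprocessing import StandardScaler
trainY2d = np.asarray(trainY).reshape(-1, 1)  # StandardScaler expects 2D input; assumes a single target column
scalerY = StandardScaler().fit(trainY2d)  # fit y scaler
pipeline.fit(trainX, scalerY.transform(trainY2d))  # fit your pipeline to scaled Y
predictedY = scalerY.inverse_transform(np.asarray(pipeline.predict(testX)).reshape(-1, 1))  # predict and rescale to the original scale of Y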

The inverse_transform() function maps the scaled values back to the original scale, using the mean and standard deviation computed in StandardScaler().fit().
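Concretely, the inverse is y = y_scaled * std + mean, applied per column. A minimal sketch with made-up numbers, using the fitted scaler's mean_ and scale_ attributes to do the same arithmetic by hand:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up target values, shaped (n_samples, 1) because StandardScaler expects 2D input.
y = np.array([[10.0], [20.0], [30.0], [40.0]])

scaler = StandardScaler().fit(y)
y_scaled = scaler.transform(y)

# inverse_transform undoes the scaling using the stored mean_ and scale_ ...
y_back = scaler.inverse_transform(y_scaled)
# ... which is equivalent to doing the arithmetic by hand.
y_manual = y_scaled * scaler.scale_ + scaler.mean_

print(scaler.mean_, scaler.scale_)
print(np.allclose(y_back, y), np.allclose(y_manual, y))
```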

You can always fit your model without scaling Y, as you mentioned, but depending on your data this can be risky, since it can lead your model to overfit. You have to test it ;)
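If you do want a scaled target, scikit-learn (0.20 and later) also provides TransformedTargetRegressor, which bundles the fit, transform and inverse_transform steps for Y into a single estimator. This is only a sketch of an alternative to the manual scalerY approach: LinearRegression stands in for the KerasRegressor, the data is synthetic, and whether your Keras wrapper can be cloned and wrapped this way is something you would need to verify.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical); the target is on a scale far from standardised.
rng = np.random.default_rng(0)
trainX = rng.normal(size=(100, 2))
trainY = 500.0 + 30.0 * trainX[:, 0] - 10.0 * trainX[:, 1]
testX = rng.normal(size=(20, 2))
testY = 500.0 + 30.0 * testX[:, 0] - 10.0 * testX[:, 1]

# Same structure as the pipeline in the question, with a stand-in regressor.
pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("model", LinearRegression()),
])

# The wrapper scales Y before fitting and inverts the scaling in predict().
model = TransformedTargetRegressor(regressor=pipeline, transformer=StandardScaler())
model.fit(trainX, trainY)

predictedY = model.predict(testX)  # already on the original scale of Y
print(np.allclose(predictedY, testY))
```

The advantage of this design is that the Y scaler travels with the model, so cross-validation and grid search refit it on each training fold without extra bookkeeping.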

scikit-learn - Convert pipeline prediction to original value/scale

Tags:

python

machine-learning

keras

data-science

andyandy
