I have data at two stages:
import numpy as np
data_pre = np.array([[1., 2., 203.],
                     [0.5, np.nan, 208.]])
data_post = np.array([[2., 2., 203.],
                      [0.5, 2., 208.]])
I also have two pre-existing fitted estimators:
from sklearn.preprocessing import Imputer
from sklearn.ensemble import GradientBoostingRegressor
imp = Imputer(missing_values=np.nan, strategy='mean', axis=1).fit(data_pre)
gbm = GradientBoostingRegressor().fit(data_post[:,:2], data_post[:,2])
I need to pass a fitted pipeline and data_pre to another function.  
def the_function_i_need(estimators):
    """
    Combine already-fitted estimators into a fitted pipeline, without refitting.
    """
    return fitted_pipeline
fitted_pipeline = the_function_i_need([imp, gbm])
sweet_output = static_function(fitted_pipeline, data_pre) 
Is there a way to combine these two existing, fitted model objects into a fitted pipeline without refitting them, or am I out of luck?
I looked into this and couldn't find a straightforward way to do it.
The only approach I can think of is to write a custom transformer that wraps the existing Imputer and GradientBoostingRegressor. Initialize the wrapper with your already fitted Imputer or regressor, override fit so that it does nothing, and have every subsequent transform call delegate to the transform of the underlying fitted model (for the regressor, predict would be delegated the same way). This is a hack, and I would only use it if it really matters for your application. A good tutorial on writing custom classes for scikit-learn pipelines can be found here. Another working example of custom pipeline objects, from scikit-learn's documentation, can be found here.
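The wrapper idea described above could look roughly like the following. This is a minimal sketch under several assumptions, not a tested recipe for every scikit-learn version: the class names PrefittedTransformer and PrefittedRegressor are made up for illustration, SimpleImputer stands in for Imputer (which was removed from newer scikit-learn releases), and the toy data is reduced to two feature columns so that the imputer and the regressor agree on the input shape. The `__sklearn_is_fitted__` hook is included so that scikit-learn's fitted-ness check accepts the wrappers even though their fit does nothing.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, TransformerMixin
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline


class PrefittedTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper around an already-fitted transformer."""

    def __init__(self, transformer):
        self.transformer = transformer

    def fit(self, X, y=None):
        # Deliberately a no-op: the wrapped transformer is already fitted.
        return self

    def transform(self, X):
        return self.transformer.transform(X)

    def __sklearn_is_fitted__(self):
        # Tell scikit-learn's check_is_fitted that this wrapper is usable.
        return True


class PrefittedRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical wrapper around an already-fitted regressor."""

    def __init__(self, regressor):
        self.regressor = regressor

    def fit(self, X, y=None):
        # Deliberately a no-op: the wrapped regressor is already fitted.
        return self

    def predict(self, X):
        return self.regressor.predict(X)

    def __sklearn_is_fitted__(self):
        return True


def the_function_i_need(estimators):
    """Wrap fitted estimators (transformers first, regressor last) in a Pipeline."""
    *transformers, final = estimators
    steps = [("step_%d" % i, PrefittedTransformer(t))
             for i, t in enumerate(transformers)]
    steps.append(("final", PrefittedRegressor(final)))
    return Pipeline(steps)


# Toy data (assumption: features only, target kept separate).
X_pre = np.array([[1.0, 2.0], [0.5, np.nan]])
X_post = np.array([[2.0, 2.0], [0.5, 2.0]])
y = np.array([203.0, 208.0])

imp = SimpleImputer(missing_values=np.nan, strategy="mean").fit(X_pre)
gbm = GradientBoostingRegressor(random_state=0).fit(X_post, y)

fitted_pipeline = the_function_i_need([imp, gbm])
predictions = fitted_pipeline.predict(X_pre)

# Reference result: apply the two fitted estimators by hand.
expected = gbm.predict(imp.transform(X_pre))
```

The pipeline is never fitted here; predict simply passes the data through the wrapped imputer and then the wrapped regressor. One caveat: anything that clones the pipeline (cross-validation or grid search helpers, for example) would most likely discard the fitted state, because cloning rebuilds estimators from their constructor parameters.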