I have a model built with Python scikit-learn. I understand that the models can be saved in Pickle or Joblib formats. Are there any existing methods out there to save the jobs in JSON format? Please see the model build code below for reference:
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
import pickle
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names =['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)
# Fit the model on 33%
model = LogisticRegression()
model.fit(X_train, Y_train)
filename = 'finalized_model.sav'
pickle.dump(model, open(filename, 'wb'))
Save your model Using JSON format we need indented output so we provide indent parameter and set it to 4. Save the JSON string to a file. Since the data serialization using JSON actually saves the object into a string format, rather than byte stream, the 'regressor_param.
JavaScript Object Notation (JSON) is a standardized format commonly used to transfer data as text that can be sent over a network. It's used by lots of APIs and Databases, and it's easy for both humans and machines to read. JSON represents objects as name/value pairs, just like a Python dictionary.
You'll have to cook up your own serialization/deserialization recipe. Fortunately, logistic regression can basically be captured by the coefficients and the intercept. However, the LogisticRegression
object keeps some other metadata around which we might as well capture. I threw together the following functions that does the dirty-work. Keep in mind, this is still rough:
import numpy as np
import json
from sklearn.linear_model import LogisticRegression
def logistic_regression_to_json(lrmodel, file=None):
if file is not None:
serialize = lambda x: json.dump(x, file)
else:
serialize = json.dumps
data = {}
data['init_params'] = lrmodel.get_params()
data['model_params'] = mp = {}
for p in ('coef_', 'intercept_','classes_', 'n_iter_'):
mp[p] = getattr(lrmodel, p).tolist()
return serialize(data)
def logistic_regression_from_json(jstring):
data = json.loads(jstring)
model = LogisticRegression(**data['init_params'])
for name, p in data['model_params'].items():
setattr(model, name, np.array(p))
return model
Note, with just 'coef_', 'intercept_','classes_'
you could do the predictions yourself, since logistic regression is a straight-forward linear model, it's simply matrix-multiplication.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With