Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python scikit-learn to JSON

I have a model built with Python scikit-learn. I understand that the models can be saved in Pickle or Joblib formats. Are there any existing methods out there to save the jobs in JSON format? Please see the model build code below for reference:

import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
import pickle
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names =['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)
# Fit the model on 33%
model = LogisticRegression()
model.fit(X_train, Y_train)
filename = 'finalized_model.sav'
pickle.dump(model, open(filename, 'wb'))
like image 651
user1124702 Avatar asked Jan 18 '18 18:01

user1124702


People also ask

How do you save Sklearn model JSON?

Save your model Using JSON format we need indented output so we provide indent parameter and set it to 4. Save the JSON string to a file. Since the data serialization using JSON actually saves the object into a string format, rather than byte stream, the 'regressor_param.

What is JSON () in Python?

JavaScript Object Notation (JSON) is a standardized format commonly used to transfer data as text that can be sent over a network. It's used by lots of APIs and Databases, and it's easy for both humans and machines to read. JSON represents objects as name/value pairs, just like a Python dictionary.


1 Answers

You'll have to cook up your own serialization/deserialization recipe. Fortunately, logistic regression can basically be captured by the coefficients and the intercept. However, the LogisticRegression object keeps some other metadata around which we might as well capture. I threw together the following functions that does the dirty-work. Keep in mind, this is still rough:

import numpy as np
import json
from sklearn.linear_model import LogisticRegression

def logistic_regression_to_json(lrmodel, file=None):
    if file is not None:
        serialize = lambda x: json.dump(x, file)
    else:
        serialize = json.dumps
    data = {}
    data['init_params'] = lrmodel.get_params()
    data['model_params'] = mp = {}
    for p in ('coef_', 'intercept_','classes_', 'n_iter_'):
        mp[p] = getattr(lrmodel, p).tolist()
    return serialize(data)

def logistic_regression_from_json(jstring):
    data = json.loads(jstring)
    model = LogisticRegression(**data['init_params'])
    for name, p in data['model_params'].items():
        setattr(model, name, np.array(p))
    return model

Note, with just 'coef_', 'intercept_','classes_' you could do the predictions yourself, since logistic regression is a straight-forward linear model, it's simply matrix-multiplication.

like image 124
juanpa.arrivillaga Avatar answered Sep 24 '22 01:09

juanpa.arrivillaga