Python scikit-learn to JSON

I have a model built with Python scikit-learn. I understand that models can be saved in Pickle or Joblib format. Are there any existing methods to save a model in JSON format instead? The code that builds the model is below for reference:

import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
import pickle
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names =['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)
# Fit the model on the training split (67% of the data; 33% is held out for testing)
model = LogisticRegression()
model.fit(X_train, Y_train)
filename = 'finalized_model.sav'
# Use a context manager so the file is closed after the dump
with open(filename, 'wb') as f:
    pickle.dump(model, f)

651

asked Jan 18 '18 18:01

user1124702

1 Answer

You'll have to cook up your own serialization/deserialization recipe. Fortunately, a fitted logistic regression is essentially captured by its coefficients and intercept. However, the LogisticRegression object keeps some other metadata around, which we might as well capture too. I threw together the following functions that do the dirty work. Keep in mind, this is still rough:

import numpy as np
import json
from sklearn.linear_model import LogisticRegression

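def logistic_regression_to_json(lrmodel, file=None):
    # Write to the open file object if one is given (returns None);
    # otherwise return the JSON as a string.
    if file is not None:
        serialize = lambda x: json.dump(x, file)
    else:
        serialize = json.dumps
    data = {}
    # Constructor arguments (C, penalty, solver, ...)
    data['init_params'] = lrmodel.get_params()
    # Fitted attributes; these are NumPy arrays, so convert them to lists
    data['model_params'] = mp = {}
    for p in ('coef_', 'intercept_','classes_', 'n_iter_'):
        mp[p] = getattr(lrmodel, p).tolist()
    return serialize(data)

def logistic_regression_from_json(jstring):
    # Expects the JSON string produced by logistic_regression_to_json
    data = json.loads(jstring)
    model = LogisticRegression(**data['init_params'])
    # Restore the fitted attributes as NumPy arrays
    for name, p in data['model_params'].items():
        setattr(model, name, np.array(p))
    return model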

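Note that with just 'coef_', 'intercept_' and 'classes_' you could do the predictions yourself. Logistic regression is a straightforward linear model, so the decision score is simply a matrix multiplication plus the intercept; the sign of that score picks the class, and the logistic (sigmoid) function turns it into a probability.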
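As a rough sketch of what "doing the predictions yourself" could look like, the snippet below reads the JSON layout produced by the functions above and predicts with NumPy alone. The helper names predict_from_json and positive_class_probability are made up for illustration, and the coefficients in example_json are invented numbers, not values from the diabetes model. It assumes the usual scikit-learn convention that a positive score maps to classes_[1] in the binary case and that the highest score wins otherwise.

```python
import json

import numpy as np


def _decision_scores(jstring, X):
    """Return (scores, classes) for the JSON layout used above."""
    mp = json.loads(jstring)["model_params"]
    coef = np.asarray(mp["coef_"], dtype=float)            # (1, n_features) when binary
    intercept = np.asarray(mp["intercept_"], dtype=float)
    classes = np.asarray(mp["classes_"])
    # Linear score: X @ coef.T + intercept
    scores = np.asarray(X, dtype=float) @ coef.T + intercept
    return scores, classes


def predict_from_json(jstring, X):
    """Hypothetical helper: predict class labels without scikit-learn."""
    scores, classes = _decision_scores(jstring, X)
    if scores.shape[1] == 1:
        # Binary case: a positive score selects classes[1]
        idx = (scores[:, 0] > 0).astype(int)
    else:
        # Multi-class case: pick the class with the highest score
        idx = scores.argmax(axis=1)
    return classes[idx]


def positive_class_probability(jstring, X):
    """Hypothetical helper, binary models only: P(y == classes[1])."""
    scores, _ = _decision_scores(jstring, X)
    return 1.0 / (1.0 + np.exp(-scores[:, 0]))


# Invented two-feature model, for illustration only
example_json = json.dumps({
    "model_params": {
        "coef_": [[1.0, -2.0]],
        "intercept_": [0.5],
        "classes_": [0.0, 1.0],
    }
})
samples = [[1.0, 0.0], [0.0, 1.0]]   # scores: 1.5 and -1.5
labels = predict_from_json(example_json, samples)
probs = positive_class_probability(example_json, samples)
```

This ignores 'init_params' and 'n_iter_' entirely, which is why the JSON could be consumed from another language just as easily; if you need predict_proba for a multi-class model, rebuilding the LogisticRegression object is the safer route.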

124

answered Sep 24 '22 01:09

juanpa.arrivillaga

Tags: python, json, scikit-learn, logistic-regression
