Web application that uses scikit-learn

I have trained a scikit-learn classifier locally and I need to create a simple web application that demonstrates its use. I'm a complete noob at web app development, and I don't want to waste hours building a web app with a framework that doesn't support the modules I'm using.

  1. What do you suggest would be a good approach for this task?
  2. What web app development framework should I use (if any)?
  3. Do I have to dive into things like Heroku or Django, or is there a simpler and quicker solution for a simple scientific demo?

My thought was to pickle the classifier I trained, unpickle it on the server, and then run the classification from the server, but I'm not sure where to begin.

zenpoy asked Jul 22 '12 12:07



6 Answers

If this is just for a demo, train your classifier offline, pickle the model, and then use a simple Python web framework such as Flask or bottle to unpickle the model at server startup and call its predict function in an HTTP request handler.
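The approach above can be sketched with nothing but the standard library. This is a rough illustration, not a recommended stack: `ThresholdModel`, `app`, `call_app`, and `serve` are hypothetical names, the stand-in model replaces a real trained classifier, and the pickled bytes are kept in memory instead of a file. Flask and bottle are both WSGI frameworks, so either would replace the hand-written handler with a decorated route function.

```python
import io
import json
import pickle
from wsgiref.simple_server import make_server


class ThresholdModel:
    """Hypothetical stand-in for a trained classifier; a scikit-learn
    estimator exposes the same predict(X) call."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        return [int(sum(row) > self.threshold) for row in rows]


# Offline step: pickle the trained model. A real setup would write these
# bytes to a file and ship that file with the application.
MODEL_BYTES = pickle.dumps(ThresholdModel(threshold=5.0))

# Server startup: unpickle once, not on every request.
MODEL = pickle.loads(MODEL_BYTES)


def app(environ, start_response):
    """WSGI handler: POST a JSON list of feature rows, get predictions back."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    rows = json.loads(environ["wsgi.input"].read(length) or b"[]")
    body = json.dumps({"predictions": MODEL.predict(rows)}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(rows):
    """Invoke the handler in-process, the way a WSGI server would."""
    payload = json.dumps(rows).encode("utf-8")
    environ = {"REQUEST_METHOD": "POST",
               "CONTENT_LENGTH": str(len(payload)),
               "wsgi.input": io.BytesIO(payload)}
    status = []
    chunks = app(environ, lambda s, headers: status.append(s))
    return status[0], json.loads(b"".join(chunks))


def serve(host="localhost", port=8000):
    """Run the demo with the standard library's reference WSGI server."""
    make_server(host, port, app).serve_forever()
```

Calling `serve()` starts a local server; `call_app([[1, 2], [4, 4]])` exercises the same handler without opening a socket, which is handy for a quick check. Only unpickle files you created yourself, since unpickling can execute arbitrary code.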

Django is a full-featured framework, so it takes longer to learn than Flask or bottle, but it has great documentation and a larger community.

Heroku is a service for hosting your application in the cloud. It's possible to host Flask applications on Heroku; here is a simple template project with instructions for doing so.

For "production" setups I would advise you not to use pickle but to write your own persistence layer for the machine learning model, so that you have full control over the parameters you store and are more robust to library upgrades that might break the unpickling of old models.
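One possible shape for such a persistence layer, sketched for a binary linear classifier only: store the learned parameters as versioned JSON and rebuild the decision rule by hand. The function names and the JSON layout are assumptions made for illustration. With scikit-learn, the values would typically come from a fitted binary linear model's `coef_[0]`, `intercept_[0]`, and `classes_` attributes.

```python
import json

FORMAT_VERSION = 1  # bump whenever the stored layout changes


def dump_linear_model(coef, intercept, classes):
    """Serialize the parameters of a binary linear classifier to JSON text."""
    return json.dumps({
        "format_version": FORMAT_VERSION,
        "coef": [float(w) for w in coef],
        "intercept": float(intercept),
        "classes": list(classes),
    })


def load_linear_model(text):
    """Rebuild a predict(rows) function from the stored parameters."""
    params = json.loads(text)
    if params.get("format_version") != FORMAT_VERSION:
        raise ValueError("unsupported model format: %r"
                         % params.get("format_version"))
    coef = params["coef"]
    intercept = params["intercept"]
    classes = params["classes"]

    def predict(rows):
        labels = []
        for row in rows:
            # Linear decision function: w . x + b
            score = sum(w * x for w, x in zip(coef, row)) + intercept
            labels.append(classes[1] if score > 0 else classes[0])
        return labels

    return predict


stored = dump_linear_model([1.0, -2.0], 0.5, ["neg", "pos"])
predict = load_linear_model(stored)
print(predict([[1, 0], [0, 1]]))  # ['pos', 'neg']
```

The stored file does not depend on pickle or on the library's internal class layout, so a library upgrade cannot break loading; the cost is that each model type needs its own small loader.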

ogrisel answered Sep 27 '22 22:09


While this is not a classifier, I have implemented a simple machine learning web service using the bottle framework and scikit-learn. Given a dataset in .csv format, it returns 2D visualizations produced with principal component analysis (PCA) and linear discriminant analysis (LDA).

More information and example data files can be found at: http://mindwriting.org/blog/?p=153

Here is the implementation.

upload.html:

<form
 action="/plot" method="post"
 enctype="multipart/form-data"
>
Select a file: <input type="file" name="upload" />
<input type="submit" value="PCA & LDA" />
</form>

pca_lda_viz.py (modify host name and port number):

import base64
import csv
import io

import matplotlib
matplotlib.use('Agg')  # headless backend; select it before importing pyplot

import matplotlib.pyplot as plt
import numpy as np
from bottle import route, run, request, static_file
from matplotlib.font_manager import FontProperties
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

html = '''
<html>
    <body>
        <img src="data:image/png;base64,{}" />
    </body>
</html>
'''


@route('/')
def root():
    return static_file('upload.html', root='.')


@route('/plot', method='POST')
def plot():
    # Get the data: the first row is a header, the last column is the class
    upload = request.files.get('upload')
    lines = upload.file.read().decode('utf-8').splitlines()
    mydata = list(csv.reader(lines, delimiter=','))

    x = [row[:-1] for row in mydata[1:]]
    classes = [row[-1] for row in mydata[1:]]
    labels = sorted(set(classes))

    X = np.array(x).astype('float')
    y = np.array([labels.index(myclass) for myclass in classes])
    target_names = labels

    # Apply dimensionality reduction
    pca = PCA(n_components=2)
    X_r = pca.fit(X).transform(X)

    # LDA with n_components=2 needs at least three classes in the data
    lda = LDA(n_components=2)
    X_r2 = lda.fit(X, y).transform(X)

    # Create 2D visualizations
    fig = plt.figure()
    ax = fig.add_subplot(1, 2, 1)
    bx = fig.add_subplot(1, 2, 2)

    fontP = FontProperties()
    fontP.set_size('small')

    # One random RGB color per class
    colors = np.random.rand(len(labels), 3)

    for c, i, target_name in zip(colors, range(len(labels)), target_names):
        ax.scatter(X_r[y == i, 0], X_r[y == i, 1], color=c, label=target_name)
        bx.scatter(X_r2[y == i, 0], X_r2[y == i, 1], color=c, label=target_name)

    ax.legend(loc='upper center', bbox_to_anchor=(1.05, -0.05),
              fancybox=True, shadow=True, ncol=len(labels), prop=fontP)
    ax.set_title('PCA')
    ax.tick_params(axis='both', which='major', labelsize=6)
    bx.set_title('LDA')
    bx.tick_params(axis='both', which='major', labelsize=6)

    # Encode the image as a base64 PNG for the inline <img> tag
    buf = io.BytesIO()
    fig.savefig(buf, format='png')
    plt.close(fig)
    data = base64.b64encode(buf.getvalue()).decode('ascii')

    return html.format(data)


run(host='mindwriting.org', port=8079, debug=True)
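
The service above reads the uploaded file with the first row as a header and the last column as the class label. A small standalone sketch of that parsing step, using made-up sample data and a hypothetical helper name `parse_labeled_csv`, may make the expected layout clearer:

```python
import csv
import io

# Made-up sample in the expected layout: header row, class label last.
SAMPLE_CSV = """sepal_length,sepal_width,species
5.1,3.5,setosa
7.0,3.2,versicolor
6.3,3.3,virginica
4.9,3.0,setosa
"""


def parse_labeled_csv(text):
    """Split CSV text into float feature rows, integer class indices,
    and the sorted list of class names."""
    rows = list(csv.reader(io.StringIO(text)))
    body = [row for row in rows[1:] if row]  # skip the header and blank lines
    features = [[float(v) for v in row[:-1]] for row in body]
    names = sorted({row[-1] for row in body})
    targets = [names.index(row[-1]) for row in body]
    return features, targets, names


features, targets, names = parse_labeled_csv(SAMPLE_CSV)
print(names)    # ['setosa', 'versicolor', 'virginica']
print(targets)  # [0, 1, 2, 0]
```

The integer indices play the role of `y` in the service, and the sorted class names become the legend labels.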

user3707687 answered Sep 27 '22 23:09


You can follow the tutorial below to deploy your scikit-learn model in Azure ML and get the web service automatically generated:

Build and Deploy a Predictive Web App Using Python and Azure ML

Alternatively, the combination of yHat and Heroku may also do the trick.

leo9r answered Sep 27 '22 22:09


I'm working on a Docker image that wraps the predict and predict_proba methods and exposes them as a web API: https://github.com/hexacta/docker-sklearn-predict-http-api

You need to save your model:

import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
joblib.dump(clf, 'iris-svc.pkl')

create a Dockerfile:

FROM hexacta/sklearn-predict-http-api:latest
COPY iris-svc.pkl /usr/src/app/model.pkl

and run the container:

$ docker build -t iris-svc .
$ docker run -d -p 4000:8080 iris-svc

then you can make requests:

$ curl -H "Content-Type: application/json" -X POST -d '{"sepal length (cm)":4.4}' http://localhost:4000/predictproba
  [{"0":0.8284069169,"1":0.1077571623,"2":0.0638359208}]
$ curl -H "Content-Type: application/json" -X POST -d '[{"sepal length (cm)":4.4}, {"sepal length (cm)":15}]' http://localhost:4000/predict
  [0, 2]
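
The same request can be made from Python with the standard library. This is a sketch that assumes the container from the steps above is listening on localhost:4000; the helper names `build_predict_request` and `predict` are hypothetical and not part of the Docker image.

```python
import json
import urllib.request


def build_predict_request(rows, base_url="http://localhost:4000"):
    """Build the POST request that the curl command above sends to /predict."""
    payload = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        base_url + "/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, base_url="http://localhost:4000"):
    """Send the request and decode the JSON reply.

    Requires the container to be running; otherwise urlopen raises an error.
    """
    request = build_predict_request(rows, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container up, `predict([{"sepal length (cm)": 4.4}, {"sepal length (cm)": 15}])` should return the same list that the second curl call prints.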

pomber answered Sep 27 '22 22:09


You can use Plotly Dash for a demo or even for an app with limited scope.

See https://dash-gallery.plotly.host/Portal/ for some examples with source code, including machine learning examples that use sklearn.

See https://dash.plotly.com/deployment for deployment, mainly with Heroku.

SoufianeK answered Sep 28 '22 00:09


If you go the Flask route, I highly recommend that you watch the Corey Schafer series on YouTube. It's a solid series that will get you underway quickly, and there are many helpful notes from other viewers in the comment section.

Additionally, since I presume you'll build your models elsewhere and want to score them on your site, you will likely want to use pickle to store the model objects after development, and then load them with pickle in your Flask config.py.

C. Cooney answered Sep 27 '22 22:09