Sklearn predict multiple outputs

I wrote the following code:

from sklearn import tree

# Dataset & labels
# Using metric units
# features = [height, weight, style]
styles = ['modern', 'classic']
features = [[1.65, 65, 1], 
            [1.55, 50, 1],
            [1.76, 64, 0],
            [1.68, 77, 0] ]
labels = ['Yellow dress', 'Red dress', 'Blue dress', 'Green dress']

# Decision Tree
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)

# Returns the dress
# input() returns strings in Python 3, so convert to numbers first
height = float(input('Height: '))
weight = float(input('Weight: '))
style = int(input('Modern [0] or Classic [1]: '))
print(clf.predict([[height, weight, style]]))

This code takes the user's height, weight, and preferred style, then returns the dress that fits her best. Is there a way to return multiple options, for instance two or more dresses?

UPDATE

from sklearn import tree
import numpy as np

# Dataset & labels
# features = [height, weight, style]
# styles = ['modern', 'classic']
features = [[1.65, 65, 1], 
            [1.55, 50, 1],
            [1.76, 64, 1],
            [1.72, 68, 0],
            [1.73, 68, 0],
            [1.68, 77, 0]]
labels =    ['Yellow dress',
            'Red dress',
            'Blue dress',
            'Green dress',
            'Purple dress',
            'Orange dress']

# Decision Tree
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)

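# Returns the class probabilities for each dress
# input() returns strings in Python 3, so convert to numbers first
height = float(input('Height: '))
weight = float(input('Weight: '))
style = int(input('Modern [0] or Classic [1]: '))

print(clf.predict_proba([[height, weight, style]]))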

If the user is 1.72 m tall and weighs 68 kg, I want to show both the green and the purple dresses. This example just returns 100% for the green dress.

bodruk asked Nov 23 '16 17:11


2 Answers

Yes, you can: ask the classifier for the probability of each class. Many classifiers, including DecisionTreeClassifier, implement a method called predict_proba() for this.

See the sklearn documentation.

It returns, for each class, the probability that your sample belongs to that class. The probabilities are ordered to match clf.classes_.

You can then return, for example, the labels associated with the two or three highest probabilities.
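The idea above can be sketched as follows, using the dataset from the question's update. The helper name top_k_labels and the random_state=0 argument are assumptions added for illustration; they are not part of the original code.

```python
import numpy as np
from sklearn import tree

# Dataset from the question: [height, weight, style]
features = [[1.65, 65, 1], [1.55, 50, 1], [1.76, 64, 1],
            [1.72, 68, 0], [1.73, 68, 0], [1.68, 77, 0]]
labels = ['Yellow dress', 'Red dress', 'Blue dress',
          'Green dress', 'Purple dress', 'Orange dress']

# random_state is fixed only to make the sketch reproducible (assumption)
clf = tree.DecisionTreeClassifier(random_state=0).fit(features, labels)

def top_k_labels(model, sample, k=2):
    """Hypothetical helper: return the k most probable (label, probability) pairs."""
    probs = model.predict_proba([sample])[0]
    # argsort sorts ascending, so reverse it and keep the first k indices
    order = np.argsort(probs)[::-1][:k]
    # model.classes_ holds the labels in the same order as the probability columns
    return [(str(model.classes_[i]), float(probs[i])) for i in order]

print(top_k_labels(clf, [1.72, 68, 0], k=2))
```

Note that with an unrestricted tree like this one, every leaf ends up holding a single dress, so the first entry has probability 1.0 and the remaining entries have probability 0.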

MMF answered Sep 21 '22 16:09


predict() returns only the class with the highest probability. If you use predict_proba() instead, it returns an array with the probability of each class, so you can, for instance, pick the classes above a certain threshold.

Here is the documentation for the method.

You could do something like this with it:

probs = clf.predict_proba([[height, weight, style]])
threshold = 0.25 # change this accordingly
# probs[0] is ordered like clf.classes_, so use it to look up the dress label
for index, prob in enumerate(probs[0]):
    if prob > threshold:
        print(clf.classes_[index])
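One likely reason the question's updated example reports 100% for a single dress is that, by default, the tree keeps splitting until every leaf is pure, so predict_proba() yields only 0s and 1s. The sketch below combines the threshold idea with min_samples_leaf=2, a setting chosen here as an assumption and not taken from the original answers, so that each leaf keeps at least two dresses. The helper name labels_above_threshold is hypothetical.

```python
from sklearn import tree

# Dataset from the question: [height, weight, style]
features = [[1.65, 65, 1], [1.55, 50, 1], [1.76, 64, 1],
            [1.72, 68, 0], [1.73, 68, 0], [1.68, 77, 0]]
labels = ['Yellow dress', 'Red dress', 'Blue dress',
          'Green dress', 'Purple dress', 'Orange dress']

# Assumption: min_samples_leaf=2 forces every leaf to hold at least two
# training samples, so no single dress can reach a probability of 1.0 here.
clf = tree.DecisionTreeClassifier(min_samples_leaf=2, random_state=0)
clf.fit(features, labels)

def labels_above_threshold(model, sample, threshold=0.25):
    """Hypothetical helper: return every label whose probability exceeds threshold."""
    probs = model.predict_proba([sample])[0]
    return [str(label) for label, prob in zip(model.classes_, probs)
            if prob > threshold]

suggestions = labels_above_threshold(clf, [1.72, 68, 0])
print(suggestions)
```

Which dresses end up sharing a leaf depends on how the tree chooses its splits, so the exact suggestions may differ from the pair the question asks for; a neighbor-based model may be a better fit if "similar measurements" is the goal.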

Arthur Camara answered Sep 20 '22 16:09