I am a newbie to machine learning. I have been struggling with a problem for a few weeks now and I hope someone can help here:
I have a data set with one continuous variable and the rest categorical. I managed to encode the categorical variables and would like to build a multi-output classifier.
My data set has the columns A, B, C, D, E, F, G. The features are A and B; I would like to predict C, D, E, F, and G.
I have spent days on the scikit-learn documentation on multi-output classifiers, and searching this site, but none of it is clear to me.
Can anyone please point me in the right direction to find some sample code on how to create the classifier and predict with some sample data?
Thank you in advance. P.S.: I am not using TensorFlow and would appreciate help with sklearn.
This is called multi-task learning: a single model learns several functions while sharing (some or all of) its weights. It's fairly common; for example, one model can do both image recognition and detection. What you need to do is define several output branches (called heads), each with its own loss function.
Here's a very simple example in TensorFlow that learns Y1 and Y2 from X (from this post series):
import tensorflow as tf

# Define the placeholders for the input and the two targets
X = tf.placeholder("float", [10, 10], name="X")
Y1 = tf.placeholder("float", [10, 1], name="Y1")
Y2 = tf.placeholder("float", [10, 1], name="Y2")

# Define the weights: a shared layer plus one head per target
# (tf.Variable needs an initial tensor, not just a shape)
shared_layer_weights = tf.Variable(tf.random_normal([10, 20]), name="share_W")
Y1_layer_weights = tf.Variable(tf.random_normal([20, 1]), name="share_Y1")
Y2_layer_weights = tf.Variable(tf.random_normal([20, 1]), name="share_Y2")

# Construct the layers with ReLU activations
shared_layer = tf.nn.relu(tf.matmul(X, shared_layer_weights))
Y1_layer = tf.nn.relu(tf.matmul(shared_layer, Y1_layer_weights))
Y2_layer = tf.nn.relu(tf.matmul(shared_layer, Y2_layer_weights))

# Calculate a separate L2 loss per head (tf.nn.l2_loss takes a single
# tensor, so pass the difference between target and prediction)
Y1_Loss = tf.nn.l2_loss(Y1 - Y1_layer)
Y2_Loss = tf.nn.l2_loss(Y2 - Y2_layer)
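The snippet stops at the per-head losses. As a minimal sketch of the training step (assuming the simple summed-loss approach from the linked post series; some_X, some_Y1 and some_Y2 are hypothetical stand-ins for your own NumPy arrays):
# Combine the two head losses into one joint objective and minimize it
Joint_Loss = Y1_Loss + Y2_Loss
Optimizer = tf.train.AdamOptimizer().minimize(Joint_Loss)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    # some_X, some_Y1, some_Y2 stand in for your own training data
    _, loss = session.run([Optimizer, Joint_Loss],
                          feed_dict={X: some_X, Y1: some_Y1, Y2: some_Y2})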
If you wish to stay in pure scikit-learn, see the sklearn.multioutput module; it supports both multioutput classification and multioutput regression. Here's an example of multioutput regression:
>>> from sklearn.datasets import make_regression
>>> from sklearn.multioutput import MultiOutputRegressor
>>> from sklearn.ensemble import GradientBoostingRegressor
>>> X, y = make_regression(n_samples=10, n_targets=3, random_state=1)
>>> MultiOutputRegressor(GradientBoostingRegressor(random_state=0)).fit(X, y).predict(X)
array([[-154.75474165, -147.03498585,  -50.03812219],
       [   7.12165031,    5.12914884,  -81.46081961],
       [-187.8948621 , -100.44373091,   13.88978285],
       [-141.62745778,   95.02891072, -191.48204257],
       [  97.03260883,  165.34867495,  139.52003279],
       [ 123.92529176,   21.25719016,   -7.84253   ],
       [-122.25193977,  -85.16443186, -107.12274212],
       [ -30.170388  ,  -94.80956739,   12.16979946],
       [ 140.72667194,  176.50941682,  -17.50447799],
       [ 149.37967282,  -81.15699552,   -5.72850319]])
[Update]
Here's complete code that does multi-target classification; try running it:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier
# The data from your screenshot
# A B C D E F G
train_data = np.array([
[5, 133.5, 27, 284, 638, 31, 220],
[5, 111.9, 27, 285, 702, 36, 230],
[5, 99.3, 25, 310, 713, 39, 227],
[5, 102.5, 25, 311, 670, 34, 218],
[5, 114.8, 25, 312, 685, 34, 222],
])
# These I just made up
test_data_x = np.array([
[5, 100.0],
[5, 105.2],
[5, 102.7],
[5, 103.5],
[5, 120.3],
[5, 132.5],
[5, 152.5],
])
# Features are the first two columns (A, B); targets are the rest (C..G)
x = train_data[:, :2]
y = train_data[:, 2:]

forest = RandomForestClassifier(n_estimators=100, random_state=1)
classifier = MultiOutputClassifier(forest, n_jobs=-1)
classifier.fit(x, y)

print(classifier.predict(test_data_x))
Output (well, looks reasonable to me):
[[ 25. 310. 713. 39. 227.]
[ 25. 311. 670. 34. 218.]
[ 25. 311. 670. 34. 218.]
[ 25. 311. 670. 34. 218.]
[ 25. 312. 685. 34. 222.]
[ 27. 284. 638. 31. 220.]
[ 27. 284. 638. 31. 220.]]
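To sanity-check the fitted model beyond eyeballing the predictions, here is a minimal sketch that scores each target column separately with sklearn.metrics.accuracy_score (evaluated on the training data purely for illustration; with real data you would use a held-out test set):
from sklearn.metrics import accuracy_score

# Score each of the five targets (C..G) separately on the training data
predictions = classifier.predict(x)
for i, name in enumerate(["C", "D", "E", "F", "G"]):
    print(name, accuracy_score(y[:, i], predictions[:, i]))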
If for some reason this doesn't work or can't be applied in your case, please update the question.