I was trying out an ML example and it worked for the most part, but when I ran the code consecutively, Python started spitting out different prediction results. Now, I'm no ML expert, but that seems off?
# Example file from Google Developers: "Hello World - Machine Learning Recipes": YouTube: https://youtu.be/cKxRvEZd3Mw
# Category: Supervised Learning
# January 14, 2018
from sklearn import tree
# Declarations: Texture
bumpy = 0
smooth = 1
# Declarations: Labels
apple = 0
orange = 1
# Step(1): Collect training data
# Features: [Weight, Texture]
features = [[140, smooth], [130, smooth], [150, bumpy], [170, bumpy]]
# labels[i] is the class label for features[i]
labels = [apple, apple, orange, orange]
# Step(2): Train Classifier: Decision Tree
# Create the decision tree object, then fit (find patterns in) the features and labels
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)
# Step(3): Make Predictions
# the predict method returns the predicted class for each sample
result = clf.predict([[150, bumpy], [130, smooth], [125.5, bumpy], [110, smooth]])
# result = clf.predict([[150, bumpy]])
print("Step(3): Make Predictions: ")
for x in result:
    if x == 0:
        print("Apple")
        continue
    elif x == 1:
        print("Orange")
        continue
There's an element of randomness in (most?) decision tree algorithms, and your training set is very small, which exaggerates the effect. In scikit-learn's DecisionTreeClassifier, the randomness is typically used to decide which candidate split to pick when several look equally good, and with only four samples such ties are easy to hit.
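Here's a minimal sketch of what's happening, reusing the training data from your question (this loop is mine, not part of your script): with those four samples, both weight and texture separate apples from oranges perfectly, so the tree can legitimately split on either feature. Without a fixed random_state the tie is broken at random, and a sample like [125.5, bumpy] (light but bumpy) gets classified differently depending on which feature the tree happened to choose.
from sklearn import tree

bumpy, smooth = 0, 1
apple, orange = 0, 1
features = [[140, smooth], [130, smooth], [150, bumpy], [170, bumpy]]
labels = [apple, apple, orange, orange]

for run in range(5):
    # No random_state: the tie between the weight split and the texture split
    # can be broken differently on each fit.
    clf = tree.DecisionTreeClassifier()
    clf.fit(features, labels)
    print(run, clf.predict([[125.5, bumpy]]))  # may print [0] (apple) or [1] (orange)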
Try setting random_state to some fixed integer when you create the DecisionTreeClassifier. If you want a repeatable result for testing, you'll need to use the same "seed" value each time. The scikit-learn docs use a seed of zero in their examples:
clf = tree.DecisionTreeClassifier(random_state=0)
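Applied to the example from your question, that would look something like this (a sketch; any fixed integer works as the seed). The prediction for the ambiguous [125.5, bumpy] sample still depends on which split the seeded tree happens to pick, but it will be the same one on every run:
from sklearn import tree

bumpy, smooth = 0, 1
apple, orange = 0, 1
features = [[140, smooth], [130, smooth], [150, bumpy], [170, bumpy]]
labels = [apple, apple, orange, orange]

# Fixing random_state makes the fitted tree, and therefore its predictions, repeatable.
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(features, labels)
print(clf.predict([[150, bumpy], [130, smooth], [125.5, bumpy], [110, smooth]]))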