How to set a threshold for a scikit-learn random forest model

After looking at the precision_recall_curve, I want to use a threshold of 0.4. How do I apply it to my random forest model (binary classification), so that any probability < 0.4 is labeled 0 and any probability >= 0.4 is labeled 1?

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

random_forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=12)
random_forest.fit(X_train, y_train)

predicted = random_forest.predict(X_test)
accuracy = accuracy_score(y_test, predicted)

Documentation Precision recall
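
For context, here is a minimal sketch of how precision and recall near a threshold of 0.4 can be read off precision_recall_curve. The synthetic dataset from make_classification and the variable names positive_proba and idx are assumptions for illustration only; the original X_train and y_train are not shown in the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; the question's data is not shown).
X, y = make_classification(n_samples=500, n_features=10, random_state=12)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=12
)

random_forest = RandomForestClassifier(n_estimators=100, random_state=12)
random_forest.fit(X_train, y_train)

# Probability of the positive class (label 1) for each test row.
positive_proba = random_forest.predict_proba(X_test)[:, 1]

# precision and recall each have one more entry than thresholds.
precision, recall, thresholds = precision_recall_curve(y_test, positive_proba)

# thresholds is sorted in increasing order, so searchsorted finds the first
# curve threshold that is >= 0.4 (clamped in case 0.4 exceeds all of them).
idx = min(np.searchsorted(thresholds, 0.4), len(thresholds) - 1)
print("threshold:", thresholds[idx])
print("precision:", precision[idx], "recall:", recall[idx])
```

The curve only reports the trade-off; it does not change what predict returns, which is why the threshold has to be applied to the predict_proba output by hand.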


asked Apr 11 '18 23:04

BigData

2 Answers

Assuming you are doing binary classification, it's quite easy:

threshold = 0.4

predicted_proba = random_forest.predict_proba(X_test)
predicted = (predicted_proba[:, 1] >= threshold).astype('int')

accuracy = accuracy_score(y_test, predicted)
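
As a self-contained sketch of the same idea, the following runs end to end on synthetic data (an assumption, since the original dataset is not shown). It also looks up the positive-class column through random_forest.classes_ rather than assuming it is column 1, and compares the result with the default predict, which behaves like a 0.5 cutoff for two classes. The names positive_col and default_predicted are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; replace with your own X and y).
X, y = make_classification(n_samples=500, n_features=10, random_state=12)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=12
)

random_forest = RandomForestClassifier(n_estimators=100, random_state=12)
random_forest.fit(X_train, y_train)

threshold = 0.4

# Columns of predict_proba follow the order of random_forest.classes_,
# so find the column that belongs to label 1.
positive_col = list(random_forest.classes_).index(1)
positive_proba = random_forest.predict_proba(X_test)[:, positive_col]

# 1 where the positive-class probability reaches the threshold, else 0.
predicted = (positive_proba >= threshold).astype(int)
default_predicted = random_forest.predict(X_test)

print("accuracy at 0.4:    ", accuracy_score(y_test, predicted))
print("accuracy at default:", accuracy_score(y_test, default_predicted))
print("positives at 0.4:", predicted.sum(),
      "positives at default:", default_predicted.sum())
```

Lowering the threshold below 0.5 can only add positive predictions, so recall should not drop, while precision usually does.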

answered Oct 09 '22 15:10

Stev

random_forest = RandomForestClassifier(n_estimators=100)
random_forest.fit(X_train, y_train)

threshold = 0.4

predicted = random_forest.predict_proba(X_test)
predicted[:,0] = (predicted[:,0] < threshold).astype('int')
predicted[:,1] = (predicted[:,1] >= threshold).astype('int')


accuracy = accuracy_score(y_test, predicted)
print(round(accuracy, 4) * 100, "%")

This raises an error at the final accuracy step: "ValueError: Can't handle mix of binary and multilabel-indicator"
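
The error most likely comes from passing the whole two-column array to accuracy_score: after thresholding both columns, predicted still has shape (n_samples, 2), which scikit-learn reads as multilabel-indicator targets, while y_test is a one-dimensional vector of binary labels. A minimal sketch of the failure and the fix, using a small hand-made probability array (hypothetical values, not output from the model above):

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical predict_proba output: columns are P(class 0), P(class 1).
proba = np.array([[0.90, 0.10],
                  [0.60, 0.40],
                  [0.55, 0.45],
                  [0.20, 0.80]])
y_true = np.array([0, 1, 0, 1])
threshold = 0.4

# Thresholding both columns keeps a (4, 2) array, which accuracy_score
# refuses to compare with 1-D binary labels.
two_column = np.column_stack(
    [proba[:, 0] < threshold, proba[:, 1] >= threshold]
).astype(int)
try:
    accuracy_score(y_true, two_column)
except ValueError as err:
    print("ValueError:", err)

# Keeping only the positive-class column gives a 1-D label vector.
predicted = (proba[:, 1] >= threshold).astype(int)
accuracy = accuracy_score(y_true, predicted)
print(round(accuracy * 100, 2), "%")
```

In the original code, the equivalent change is to drop the line that rewrites column 0 and pass only the thresholded column 1 to accuracy_score.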

answered Oct 09 '22 15:10

BigData

Tags: python, scikit-learn
