Python

Question

I am working on a machine learning task in Python. I need to build a random forest and then plot how the quality on the training and test samples depends on the number of trees in the forest. Is it necessary to build a new random forest each time with a given number of trees? Or can I somehow add trees iteratively (if that is possible, could you give a code example showing how to do it)?

ldirer · Accepted Answer

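You can use the warm_start parameter of RandomForestClassifier to do just that. With warm_start=True, each call to fit keeps the trees that were already built and only adds new ones up to the current value of n_estimators.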

Here's an example you can adapt to your specific needs:

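import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss

# X_train, y_train, X_valid and y_valid are assumed to be defined already.
errors = []
growing_rf = RandomForestClassifier(n_estimators=10, n_jobs=-1,
                                    warm_start=True, random_state=1514)
for i in range(40):
    # With warm_start=True, fit keeps the existing trees and adds only the new ones.
    growing_rf.fit(X_train, y_train)
    errors.append(log_loss(y_valid, growing_rf.predict_proba(X_valid)))
    growing_rf.n_estimators += 10  # request 10 more trees on the next fit

_ = plt.plot(errors, '-r')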

Here's what I got:

[Figure: the learning curve I got]
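The question also asks for curves on both the training and the test sample. Here is a minimal sketch of that. It is not part of the original answer, and it makes a few assumptions: synthetic data from make_classification stands in for the real dataset, accuracy is used as the quality metric, and plot_curves is a hypothetical helper name chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the real dataset (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
n_trees, train_acc, test_acc = [], [], []
for _ in range(10):
    rf.fit(X_train, y_train)  # adds trees up to the current n_estimators
    n_trees.append(len(rf.estimators_))
    train_acc.append(accuracy_score(y_train, rf.predict(X_train)))
    test_acc.append(accuracy_score(y_test, rf.predict(X_test)))
    rf.n_estimators += 10  # grow the forest by 10 trees on the next fit


def plot_curves(n_trees, train_acc, test_acc):
    """Hypothetical helper: plot both accuracy curves against forest size."""
    import matplotlib.pyplot as plt  # imported here so the loop runs without it
    plt.plot(n_trees, train_acc, '-b', label='train')
    plt.plot(n_trees, test_acc, '-r', label='test')
    plt.xlabel('number of trees')
    plt.ylabel('accuracy')
    plt.legend()
    plt.show()
```

Calling plot_curves(n_trees, train_acc, test_acc) draws the two curves on one set of axes. Swap accuracy_score for whatever quality measure the task actually uses.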

Python - Random Forest - Iteratively adding trees

Tags:

scikit-learn

random-forest

Alcyone

