 

What does seed do in random forest?

I know that a seed is generally set so that we can reproduce the same result. But what does setting the seed actually do in a random forest? Does it change any of the arguments of the randomForest() function in R, such as ntree or sampsize?

I use a different seed for my random forest model each time and would like to know how different seeds affect the model.

Sowmya S. Manian asked Mar 30 '16 11:03




1 Answer

Trees grow from seeds and so do forests ;-) (scnr)

There are different ways to build a random forest, but what they all have in common is that multiple trees are built. To improve classification accuracy over a single decision tree, the individual trees in a random forest need to differ; otherwise you would just have ntree copies of the same tree. This difference is achieved by introducing randomness into the generation of the trees. The seed does not change any argument such as ntree or sampsize; it only fixes which random draws are made. The most important property of the seed is that using the same seed should always generate the same result.
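To illustrate that the same seed reproduces the same result, here is a minimal Python sketch using only the standard library. It is not the R randomForest() code, and the helper name `draws` is hypothetical, chosen for this example.

```python
import random

def draws(seed, n=5):
    """Return n pseudorandom integers in 0..99 from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return [rng.randrange(100) for _ in range(n)]

same_a = draws(42)
same_b = draws(42)

print(same_a == same_b)  # True: same seed, same sequence of draws
print(draws(43))         # a different seed starts a different sequence
```

In R, set.seed() plays the role that the seed passed to `random.Random` plays here.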

How does the randomness influence the tree building? There are multiple ways:

- Build each tree on a random subset: for each individual tree of the forest, a subset of the training examples is drawn, and the tree is built on that subset.
- At each decision point in the tree, the decision attribute is selected from a random subset of the attributes (in randomForest(), the mtry argument sets the size of that subset).

Often these two elements are combined.
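The two sources of randomness can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not how randomForest() is implemented: `plan_forest`, `n_trees`, `n_rows`, `n_features` and `mtry` are names chosen for the example, no trees are actually fitted, and the feature subset is drawn once per tree for brevity, whereas a real random forest typically draws it again at every split.

```python
import random

def plan_forest(seed, n_trees=3, n_rows=10, n_features=6, mtry=2):
    """Draw the random choices a forest would make, without fitting any trees."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_trees):
        # Bootstrap sample: n_rows row indices drawn with replacement.
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Random feature subset: mtry distinct feature indices.
        features = sorted(rng.sample(range(n_features), mtry))
        plan.append((rows, features))
    return plan

plan_a = plan_forest(seed=1)
plan_b = plan_forest(seed=1)
plan_c = plan_forest(seed=2)

print(plan_a == plan_b)            # True: same seed, identical draws for every tree
print(len(plan_a) == len(plan_c))  # True: the number of trees does not depend on the seed
```

With a different seed the drawn rows and features usually change, which is why models fitted with different seeds can give slightly different predictions.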

http://link.springer.com/article/10.1023%2FA%3A1010933404324#page-1

CAFEBABE answered Oct 14 '22 07:10