
What does the integer while setting the seed mean?

I want to randomly select n rows from my data set using the sample() function in R. I was getting a different output each time, so I used set.seed() to get the same output. I know that each integer passed to set.seed() gives me a different output, and that the output is the same if I set the same seed. But I can't work out what the integer passed as a parameter to set.seed() means. Is it just an index that goes into the random number generator algorithm, or does it refer to some part of the data from which sampling starts? For example, what does the 2 in set.seed(2) mean?

Prateek Kulkarni asked Feb 04 '13 10:02





2 Answers

In the old days, there were books that contained pages and pages of random digits (in a random order, of course).

I like to think of set.seed(x) as telling the computer to start reading random numbers from page x in a huge book of random numbers. x has nothing to do with the data; it only determines where the algorithm for choosing random numbers begins.

This might be a bit facile, but I like the analogy.
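To make the analogy a bit more concrete, here is a minimal sketch in C of what "opening the book at page x" could look like. It assumes a toy linear congruential generator; the names toy_next and read_page are made up for illustration, and this is not the generator R uses (R's default is the Mersenne Twister). The point is only that the seed becomes the generator's starting state and nothing else.

```c
#include <stddef.h>

/* Hypothetical toy generator: the "book" is the sequence produced by
   repeatedly updating a single state value. */
static unsigned int toy_next(unsigned int *state)
{
    /* unsigned arithmetic wraps around on overflow, which is intended here */
    *state = *state * 1103515245u + 12345u;
    /* return some of the higher-order bits of the new state */
    return (*state / 65536u) % 32768u;
}

/* "Open the book at page `seed`" and read the first n numbers into out. */
void read_page(unsigned int seed, unsigned int *out, size_t n)
{
    unsigned int state = seed;   /* the seed is just the initial state */
    for (size_t i = 0; i < n; i++)
        out[i] = toy_next(&state);
}
```

Calling read_page(2, a, 5) twice fills both arrays with the same five numbers, while read_page(3, c, 5) gives a different run. No data set appears anywhere in the code.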

Charlie answered Sep 20 '22 14:09



A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator.

For a seed to be used in a pseudorandom number generator, it does not need to be random. Because of the nature of number generating algorithms, so long as the original seed is ignored, the rest of the values that the algorithm generates will follow probability distribution in a pseudorandom manner.

-- Wikipedia

So, a random function could be implemented like this:

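#include <stdlib.h>   /* RAND_MAX */

/* Advance the generator state stored in *seed and return the next value. */
int rand_r(unsigned int *seed)
{
    /* linear congruential step; unsigned arithmetic wraps around on overflow */
    *seed = *seed * 1103515245 + 12345;
    /* reduce the new state to the range 0..RAND_MAX */
    return (*seed % ((unsigned int)RAND_MAX + 1));
}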

(sample taken from glibc)
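To connect this back to the question, here is a rough sketch in C of how a seeded generator could be used to pick k rows out of n. It assumes the same state update as the snippet above but returns higher-order bits instead, and it uses a partial Fisher-Yates shuffle. The names lcg_next and sample_rows are hypothetical, and this is not how R's sample() is implemented. It only shows that the seed initializes the generator, while the data comes in later, when random numbers are turned into row indices.

```c
#include <stddef.h>

/* Same state update as the snippet above, under a hypothetical name so it
   does not clash with the C library's own rand_r. */
static unsigned int lcg_next(unsigned int *seed)
{
    *seed = *seed * 1103515245u + 12345u;
    return (*seed / 65536u) % 32768u;   /* higher-order bits of the state */
}

/* Choose k distinct row indices out of n_rows (requires k <= n_rows).
   idx must have room for n_rows entries; the first k entries are the sample. */
void sample_rows(unsigned int seed, size_t *idx, size_t n_rows, size_t k)
{
    for (size_t i = 0; i < n_rows; i++)
        idx[i] = i;
    for (size_t i = 0; i < k && i + 1 < n_rows; i++) {
        /* pick j in [i, n_rows); modulo bias is ignored in this sketch */
        size_t j = i + lcg_next(&seed) % (n_rows - i);
        size_t tmp = idx[i];
        idx[i] = idx[j];
        idx[j] = tmp;
    }
}
```

With the same seed, sample_rows(2, idx, 10, 3) always selects the same three row indices, whatever the rows contain. The 2 is only the generator's starting state.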

kometonja answered Sep 22 '22 14:09
