I have a pandas DataFrame and I want to randomly select several columns from it, and I want to get the same columns every time. I found that numpy.random has a seed function, but I do not know of any equivalent in pandas.
The NumPy random seed is a numerical value that initializes the pseudo-random number generator. The seed fixes the generator's starting state, so the same seed always produces the same sequence of numbers. If we call the seed function with the value 1 before each run, we get the same random numbers every time.
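As a minimal sketch of that behaviour (the variable names `first` and `second` are only illustrative), reseeding NumPy's global generator with the same value before each draw should reproduce the same numbers:

```python
import numpy as np

# Seed the global generator, then draw three numbers.
np.random.seed(1)
first = np.random.rand(3)

# Reseed with the same value and draw again.
np.random.seed(1)
second = np.random.rand(3)

# Both draws come from the same starting state, so they match.
print(np.array_equal(first, second))
```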
The seed() method initializes the random number generator. The generator needs a starting value (the seed) to produce its sequence of pseudo-random numbers. If no seed is supplied, the generator is seeded from a source that changes between runs, such as the current system time or operating-system entropy.
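np.random.seed sets the seed of NumPy's global pseudo-random generator, which is shared by every call to the np.random functions. A RandomState, by contrast, is a separate generator instance isolated from the others, so seeding it affects only the code that draws from that instance.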
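To illustrate the isolation, here is a small sketch (the seed 42 and the variable names are arbitrary choices for the example). Drawing from the global generator in between does not change what a separately seeded RandomState returns:

```python
import numpy as np

# An isolated generator with its own seed.
rng = np.random.RandomState(42)
isolated_first = rng.rand(3)

# Recreate the isolated generator with the same seed,
# but disturb the global generator before using it.
rng = np.random.RandomState(42)
np.random.rand(10)  # advances only the global generator
isolated_second = rng.rand(3)

# The isolated draws match because the global state is not shared.
print(np.array_equal(isolated_first, isolated_second))
```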
You can use the random_state parameter of sample. The example below is taken from the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html
df['num_legs'].sample(n=3, random_state=1)
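This ensures that the same 3 values are sampled every time you run it. You can change the value of random_state to get a different, but still reproducible, selection. To sample columns rather than rows, pass axis=1 (or axis='columns') to DataFrame.sample.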
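For the original question about columns, a sketch along these lines should work. The DataFrame contents and column names here are made up for illustration; only `sample`, `n`, `axis`, and `random_state` come from the pandas API:

```python
import pandas as pd

# Hypothetical data, only for demonstration.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [4, 5, 6],
    "c": [7, 8, 9],
    "d": [10, 11, 12],
    "e": [13, 14, 15],
})

# axis="columns" samples columns instead of rows;
# a fixed random_state makes the choice repeatable.
picked_first = df.sample(n=2, axis="columns", random_state=1)
picked_second = df.sample(n=2, axis="columns", random_state=1)

print(list(picked_first.columns))
print(list(picked_first.columns) == list(picked_second.columns))
```

Because random_state is passed per call, this does not depend on, or alter, NumPy's global seed.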