I have a data set with 36k rows. I want to randomly select 9k rows from it using pandas. How do I accomplish this task?
I think you can use sample to pick 9k rows, or 25% of the rows:
df.sample(n=9000)
Or:
df.sample(frac=0.25)
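As a quick check of the two calls above, here is a minimal sketch. The 36k-row frame, the column name value and the random_state seed are made up for illustration; random_state is only there to make the draw repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical 36k-row frame standing in for the real data set.
df = pd.DataFrame({"value": np.arange(36_000)})

# Fixed number of rows; random_state makes the draw repeatable.
by_count = df.sample(n=9000, random_state=0)

# Fraction of rows; 25% of 36,000 is also 9,000.
by_frac = df.sample(frac=0.25, random_state=0)

print(len(by_count), len(by_frac))  # 9000 9000
print(by_count.index.is_unique)     # True: sample draws without replacement by default
```

If you need the remaining 27k rows as well, df.drop(by_count.index) should return them, as long as the index is unique.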
Another solution is to draw a random sample of the index with numpy.random.choice and then select those rows with loc. The index has to be unique:
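# replace=False so no row is drawn twice (np.random.choice samples with replacement by default)
df = df.loc[np.random.choice(df.index, size=9000, replace=False)]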
If the index is not unique, select by position with iloc instead:
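# replace=False so no row is drawn twice (np.random.choice samples with replacement by default)
df = df.iloc[np.random.choice(np.arange(len(df)), size=9000, replace=False)]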
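To see why a non-unique index matters, here is a small sketch of the positional approach. The frame, its repeating index and the seed are hypothetical. It uses numpy's newer Generator API (np.random.default_rng) in place of np.random.choice, and passes replace=False so that no row is picked twice.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index labels repeat (each label 0..99 occurs 360 times).
df = pd.DataFrame({"value": np.arange(36_000)}, index=np.arange(36_000) % 100)

# Selecting one label with loc returns every row that carries it.
rows_for_label_0 = df.loc[[0]]
print(len(rows_for_label_0))  # 360

# Draw 9000 distinct row positions, then select by position.
rng = np.random.default_rng(0)  # the seed is an arbitrary choice
positions = rng.choice(len(df), size=9000, replace=False)
sampled = df.iloc[positions]

print(len(sampled))                # 9000
print(sampled["value"].is_unique)  # True: each row was picked at most once
```

Selecting by position avoids the label lookup entirely, so it behaves the same whether or not the index is unique.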