How to assign random values from a list to a column in a pandas dataframe?

Tags: python, loops, random, pandas

I am working with Python in BigQuery and have a large dataframe df (circa 7 million rows). I also have a list day_list that holds some dates (say, all days in a given month).

I am trying to create an additional column "rand_day" in df with a random value from day_list in each row.

I tried a loop and an apply function, but with a dataset this large both are proving challenging.

My first attempt used a loop:

from random import sample

df["rand_day"] = ""

for i in df["row_nr"]:
  rand_day = sample(day_list, 1)[0]
  df.loc[i, "rand_day"] = rand_day

My second attempt used apply, first defining a function and then calling it:

def random_day():
  rand_day = sample(day_list, 1)[0]
  return rand_day

df["rand_day"] = df.apply(lambda row: random_day(), axis=1)

Any tips on this? Thank you

1000

asked Jan 25 '19 14:01

Jo Costa

1 Answer

Use numpy.random.choice and, if necessary, convert the dates with to_datetime:

import numpy as np
import pandas as pd

df = pd.DataFrame({
        'A':list('abcdef'),
        'B':[4,5,4,5,5,4],
})

day_list = pd.to_datetime(['2015-01-02','2016-05-05','2015-08-09'])
#alternative
#day_list = pd.DatetimeIndex(['2015-01-02','2016-05-05','2015-08-09'])

#draw one date per row in a single vectorized call (output differs on every run)
df["rand_day"] = np.random.choice(day_list, size=len(df))
print(df)
   A  B   rand_day
0  a  4 2016-05-05
1  b  5 2016-05-05
2  c  4 2015-08-09
3  d  5 2015-01-02
4  e  5 2015-08-09
5  f  4 2015-08-09
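
If the draw needs to be reproducible, the same idea can be sketched with NumPy's seeded Generator API instead of the global np.random state. This is only a rough variant under assumptions: the date range, the row count, the row_nr column and the seed below are hypothetical stand-ins, not the asker's real data. It draws one random position per row and indexes day_list with all positions at once, which keeps the column as a datetime dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's data: every day of one month
# and a frame whose length is only illustrative.
day_list = pd.date_range("2019-01-01", "2019-01-31", freq="D")
df = pd.DataFrame({"row_nr": np.arange(100_000)})

# Seeded generator so the draw can be repeated; the seed value is arbitrary.
rng = np.random.default_rng(0)

# One random position per row, then a single vectorized lookup into day_list.
positions = rng.integers(0, len(day_list), size=len(df))
df["rand_day"] = day_list[positions]
```

Because there is no Python-level loop over the rows, the cost should grow with the number of rows far more gently than the loop or apply attempts in the question.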

178

answered Sep 20 '22 15:09

jezrael
