
Add leading zeros based on a condition in Python

I have a dataframe with 5 million rows. Let's say the dataframe looked like below:

>>> df = pd.DataFrame(data={"Random": "86 7639103627 96 32 1469476501".split()})
>>> df
       Random
0          86
1  7639103627
2          96
3          32
4  1469476501

Note that the Random column is stored as a string.

If the number in column Random has fewer than 9 digits, I want to add leading zeros to make it 9 digits. If the number has 9 or more digits, I want to add leading zeros to make it 20 digits.

What I have done is this:

for i in range(0, len(df['Random'])):
    if len(df['Random'][i]) < 9:
        df['Random'][i] = df['Random'][i].zfill(9)
    else:
        df['Random'][i] = df['Random'][i].zfill(20)

Since there are over 5 million rows, this takes a very long time: tqdm reported about 5 iterations per second, and the estimated time to completion was measured in days.

Is there an easier and faster way of performing this task?

Gary asked Nov 23 '25 03:11



1 Answer

Use np.where combined with str.zfill, which pads the whole column at once instead of row by row. Alternatively, you can do the padding with str.pad.

import numpy as np
df["Random"] = np.where(df["Random"].str.len() < 9, df["Random"].str.zfill(9), df["Random"].str.zfill(20))
df
Out[9]: 
                 Random
0             000000086
1  00000000007639103627
2             000000096
3             000000032
4  00000000001469476501
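
For the str.pad alternative mentioned above, here is a minimal sketch using the sample data from the question. The variable name is_short is only illustrative, and the sketch assumes every value is a plain digit string. Unlike zfill, str.pad does not treat a leading sign specially, so the two would differ on values such as "-86".

```python
import numpy as np
import pandas as pd

# Sample data from the question; values are stored as strings.
df = pd.DataFrame({"Random": "86 7639103627 96 32 1469476501".split()})

# Compute the length mask once (is_short is a hypothetical helper name).
is_short = df["Random"].str.len() < 9

# Left-pad with "0": width 9 for short values, width 20 for the rest.
df["Random"] = np.where(
    is_short,
    df["Random"].str.pad(9, side="left", fillchar="0"),
    df["Random"].str.pad(20, side="left", fillchar="0"),
)

print(df)
```

Both branches are computed for the full column and np.where then picks one per row, so this does a little redundant work, but it stays vectorized and should still be far faster than the Python-level loop.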
BENY answered Nov 24 '25 20:11



