Replace and duplicate string with a specific max count in pandas

Question

I have a dataset, df, that repeats a sequence X times. I would like to replace certain letters in this sequence and then repeat the result up to a given max count.

Data

xy_pod  xy_pod  xy_pod  xy_pod
xy_pod  xy_pod  xy_pod  xy_pod
xy_pod  xy_pod  xy_pod  xy_pod

These are the other letters I would like to replace the 'xy' portion with:

   aa
   vee
   lee

Desired

xy_pod  xy_pod  xy_pod  xy_pod
xy_pod  xy_pod  xy_pod  xy_pod
xy_pod  xy_pod  xy_pod  xy_pod



aa_pod  aa_pod  aa_pod  aa_pod
aa_pod  aa_pod  aa_pod  aa_pod
aa_pod  aa_pod  aa_pod  aa_pod
    

vee_pod vee_pod vee_pod vee_pod
vee_pod vee_pod vee_pod vee_pod
vee_pod vee_pod vee_pod vee_pod


lee_pod lee_pod lee_pod lee_pod
lee_pod lee_pod lee_pod lee_pod
lee_pod lee_pod lee_pod lee_pod

Doing

df.replace(xy_pod, aa_pod, 12)
df.replace(aa_pod, vee_pod, 12)   
df.replace(vee_pod, lee_pod, 12)

This is very similar to the find-and-replace logic that Excel offers. However, I am not sure how to specify the number of repetitions. Also, how would I do this for multiple sequences, so that I do not have to call the function once for every new entry? Is there a more efficient way to do this?

Any suggestion or advice is appreciated.

Scott Boston · Accepted Answer

Try this:

pd.concat([df] + [df.stack().str.replace('xy', i).unstack() for i in ['aa', 'vee', 'lee']])

Output:

         0        1        2        3
0   xy_pod   xy_pod   xy_pod   xy_pod
1   xy_pod   xy_pod   xy_pod   xy_pod
2   xy_pod   xy_pod   xy_pod   xy_pod
0   aa_pod   aa_pod   aa_pod   aa_pod
1   aa_pod   aa_pod   aa_pod   aa_pod
2   aa_pod   aa_pod   aa_pod   aa_pod
0  vee_pod  vee_pod  vee_pod  vee_pod
1  vee_pod  vee_pod  vee_pod  vee_pod
2  vee_pod  vee_pod  vee_pod  vee_pod
0  lee_pod  lee_pod  lee_pod  lee_pod
1  lee_pod  lee_pod  lee_pod  lee_pod
2  lee_pod  lee_pod  lee_pod  lee_pod
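
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the frame from the question is a 3x4 block of 'xy_pod' with default integer column labels. The helper name repeat_with_prefixes is hypothetical, not part of pandas, and the sketch uses a column-wise apply with ignore_index=True in place of the stack/unstack round trip.

```python
import pandas as pd

# Assumed reconstruction of the question's data: 3 rows x 4 columns of 'xy_pod'.
df = pd.DataFrame([["xy_pod"] * 4 for _ in range(3)])


def repeat_with_prefixes(frame, old, new_values):
    """Hypothetical helper: original frame followed by one copy per replacement."""
    copies = [
        # Literal (non-regex) substring replacement, applied column by column.
        frame.apply(lambda col: col.str.replace(old, new, regex=False))
        for new in new_values
    ]
    # ignore_index=True yields a fresh 0..n-1 index instead of repeating 0, 1, 2.
    return pd.concat([frame, *copies], ignore_index=True)


result = repeat_with_prefixes(df, "xy", ["aa", "vee", "lee"])
print(result)
```

Each block keeps the original 12 cells, so the "max count" is simply the size of df. Note that str.replace swaps every occurrence of 'xy' in a cell, not only a leading one, which may or may not be what you want for other data.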

BENY · Answer

To replace only where a value repeats 12 times in a row, stack the frame first and then count the length of each run of consecutive identical values:

s = df.stack()
find_count = s.groupby(s.shift().ne(s).cumsum()).transform('count')
n = 12
out = s[find_count==n].replace({'xy':'aa'},regex=True).combine_first(s).unstack()
out
Out[227]: 
        0       1       2       3
0  aa_pod  aa_pod  aa_pod  aa_pod
1  aa_pod  aa_pod  aa_pod  aa_pod
2  aa_pod  aa_pod  aa_pod  aa_pod
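
The run-length step can be easier to follow when split into named pieces. The following is a rough sketch under the same assumption about the data (a 3x4 frame of 'xy_pod'). The variable names run_id and run_len are mine, and Series.where is used here in place of the replace/combine_first pair above.

```python
import pandas as pd

# Assumed reconstruction of the question's data: 3 rows x 4 columns of 'xy_pod'.
df = pd.DataFrame([["xy_pod"] * 4 for _ in range(3)])

# Flatten to one Series in row-major order, indexed by (row, column).
s = df.stack()

# A new run starts wherever a cell differs from the previous cell;
# the cumulative sum of those change points gives every run its own id.
run_id = s.ne(s.shift()).cumsum()

# Length of the run that each cell belongs to.
run_len = s.groupby(run_id).transform("count")

n = 12  # only runs of exactly this length are rewritten

# Keep cells whose run length is not n; otherwise take the replaced text.
replaced = s.where(run_len != n, s.str.replace("xy", "aa", regex=False))

# Restore the original 3x4 layout.
out = replaced.unstack()
print(out)
```

With the sample data every cell belongs to a single run of 12, so all of them are rewritten. If a different value interrupted the sequence, the shorter runs on either side would no longer match n and would be left unchanged.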

Tags:

python

pandas

numpy

Lynn