Replace cell values in each row of pandas column using for loop

Question

Please, help me understand my error. I'm trying to change one column in my .csv file. I have .csv file as following:

sku,name,code  
k1,aaa,886  
k2,bbb,898  
k3,ccc,342  
k4,ddd,503  
k5,eee,401

I want to replace "k" symbol with the "_" symbol in the "sku" column.
I wrote the code:

import sys  
import pandas as pd  
import numpy as np  
import datetime  

df = pd.read_csv('cat0.csv')  

for r in df['sku']:  
    r1 = r.replace('k', '_')  
    df['sku'] = r1  

print (df)

But the code inserts the last value in every row of the "sku" column. So I get:

  sku name  code
0  _5  aaa   886
1  _5  bbb   898
2  _5  ccc   342
3  _5  ddd   503
4  _5  eee   401

I want to get as following:

  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401

Jan · Accepted Answer

You can use str.replace on the whole column:

from io import StringIO
import pandas as pd

data = """sku,name,code  
k1,aaa,886  
k2,bbb,898  
k3,ccc,342  
k4,ddd,503  
k5,eee,401"""

file = StringIO(data)

df = pd.read_csv(file)
df['sku'] = df['sku'].str.replace('k', '_')

print(df)

This yields

  sku name  code  
0  _1  aaa     886
1  _2  bbb     898
2  _3  ccc     342
3  _4  ddd     503
4  _5  eee     401

Zev · Answer

As @Jan mentioned, doing it by using df['sku'] = df['sku'].str.replace('k', '_') is the best/quickest way to do this.

However, to understand why you are getting the results you are and to present a way as close to how you were doing it as possible, you'd do:

import pandas as pd

df = pd.DataFrame(
    {
        'sku':["k1", "k2", "k3", "k4", "k5"], 
        'name': ["aaa", "bbb", "ccc", "ddd", "eee"], 
        'code':[886, 898,342,503,401]
    }, columns =["sku", "name", "code"]
)

for i, r in enumerate(df['sku']):  
    r1 = r.replace('k', '_')
    df.at[i, 'sku'] = r1

Which gives:

  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401

In your code...

for r in df['sku']:  
    r1 = r.replace('k', '_')

...the issue is here:

    df['sku'] = r1

You are broadcasting your results to the entire column rather than just the row you are working on.

Replace cell values in each row of pandas column using for loop

Tags:

python

replace

python-3.x

pandas

csv

j.stalin

2 Answers

Jan

Zev

Recent Activity

Donate For Us

Replace cell values in each row of pandas column using for loop

Tags:

python

replace

python-3.x

pandas

csv

j.stalin

2 Answers

Jan

Zev

Related questions

Recent Activity

Donate For Us